


Developments in Image Processing Using Deep Learning and Reinforcement Learning

1. Introduction

2. Methodology

2.1. Search Process and Sources of Information

2.2. Inclusion and Exclusion Criteria for Article Selection

3. Technical Background

3.1. Graphics Processing Units

3.2. Image Processing

3.3. Machine Learning Overview

  • In supervised learning, predictive functions are determined from labeled training datasets, meaning each training instance must include both the input values and the expected label or output value [ 21 ]. This class of algorithms tries to identify the relationships between input and output values and to generate a predictive model able to determine the result from the corresponding input data alone [ 3 , 21 ]. Supervised learning methods are suitable for regression and data classification, commonly through algorithms such as linear regression, artificial neural networks (ANNs), decision trees (DTs), support vector machines (SVMs), k-nearest neighbors (KNNs), and random forests (RFs) [ 3 ]. As an example, systems using RF and DT algorithms have had a major impact on areas such as computational biology and disease prediction, while SVMs have also been used to study drug–target interactions and to predict several life-threatening diseases, such as cancer or diabetes [ 23 ].
  • Unsupervised learning is typically used to solve pattern recognition problems based on unlabeled training datasets. Unsupervised learning algorithms classify the training data into different categories according to their characteristics [ 21 , 24 ], mainly by means of clustering algorithms [ 24 ]. Because the number of categories is unknown and the meaning of each category is unclear, unsupervised learning is usually applied to clustering problems and association mining. Some commonly employed algorithms include K-means [ 3 ], SVM, or DT classifiers. Data processing tools like principal component analysis (PCA), used for dimensionality reduction, are often necessary prerequisites before attempting to cluster a set of data.
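The clustering workflow described above (dimensionality reduction with PCA, followed by K-means on the reduced data) can be sketched as follows. This is a minimal illustration using scikit-learn and synthetic data, assumed here purely for demonstration; the review does not prescribe any particular library.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Synthetic unlabeled data: two separated groups in a 10-dimensional space.
group_a = rng.normal(loc=0.0, scale=0.5, size=(100, 10))
group_b = rng.normal(loc=3.0, scale=0.5, size=(100, 10))
data = np.vstack([group_a, group_b])

# Dimensionality reduction as a preprocessing step before clustering.
reduced = PCA(n_components=2).fit_transform(data)

# K-means assigns each sample to one of k clusters; the meaning of each
# cluster index is not known in advance, as noted in the text.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(reduced)
```

Which cluster index corresponds to which underlying group is arbitrary; interpreting the clusters is left to the analyst, which is precisely the point made above about the unknown meaning of the categories.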

3.3.1. Deep Learning Concepts

  • Training a DNN implies the definition of a loss function, which calculates the error made in the process, given by the difference between the expected output value and that produced by the network. One of the most used loss functions in regression problems is the mean squared error (MSE) [ 30 ]. In the training phase, the weight vector is iteratively adjusted to minimize the loss function, since analytical solutions are generally not obtainable. The minimization method most commonly used is gradient descent [ 30 ].
  • Activation functions are fundamental to the learning process of neural network models, as well as to the representation of complex nonlinear functions. The activation function adds nonlinearity to the model; without it, the network would collapse into a single linear function regardless of how many layers it had. The Sigmoid function is the activation function most commonly used in the early stages of studying neural networks [ 30 ].
  • Because DL models have a greater capacity to learn and adjust to data than traditional ML models, they are more prone to overfitting. For this reason, regularization represents a crucial and highly effective set of techniques for reducing generalization errors in ML. Other techniques that can contribute to this goal are increasing the size of the training dataset, stopping at an early point in the training phase (early stopping), or randomly discarding a portion of neuron outputs during training (dropout) [ 30 ].
  • Optimizers are used to increase stability and reduce convergence times in DL algorithms; they also make the hyperparameter adjustment process more efficient [ 30 ].
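To make the interplay of these concepts concrete, the following sketch trains a single sigmoid neuron by gradient descent on the MSE loss. It is an illustrative toy example, not taken from any of the reviewed papers; the synthetic data and all variable names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 2))
# Noiseless target produced by a known rule the model should recover.
y = sigmoid(x @ np.array([1.5, -2.0]) + 0.5)

w = np.zeros(2)   # weight vector adjusted during training
b = 0.0
lr = 0.5          # learning rate (step size of gradient descent)
for _ in range(2000):
    pred = sigmoid(x @ w + b)
    err = pred - y                    # gradient of the halved squared error
    grad_z = err * pred * (1 - pred)  # chain rule through the sigmoid
    w -= lr * (grad_z @ x) / len(x)   # gradient-descent update
    b -= lr * grad_z.mean()

mse = np.mean((sigmoid(x @ w + b) - y) ** 2)
```

Because the target here is generated by the same model family, gradient descent drives the MSE close to zero and recovers the generating weights; on real data, the regularization techniques listed above (more data, early stopping, dropout) would be needed to control overfitting.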

3.3.2. Reinforcement Learning Concepts

3.4. Current Challenges

4. Image Processing Developments

4.1. Domains

4.1.1. Research Using Deep Learning

  • One of the first DL models used for video prediction, inspired by the sequence-to-sequence models common in natural language processing, uses a recurrent long short-term memory (LSTM) network to predict future frames based on a sequence of images encoded during video data processing [ 97 ].
  • In their research, Salahzadeh et al. [ 98 ] presented a novel mechatronics platform for static and real-time posture analysis, combining three components: a mechanical structure with cameras, a software module for data collection and semi-automatic image analysis, and a network to provide the raw data to the DL server. The authors concluded that their device, in addition to being inexpensive and easy to use, allows stable, non-invasive postural assessment, proving to be a useful tool in the rehabilitation of patients.
  • Studies on graphical search engines and content-based image retrieval (CBIR) systems have also been successfully developed recently [ 11 , 82 , 99 , 100 ], with processing times that might be compatible with real-time applications. Most importantly, the corresponding results of these studies appeared to show adequate image retrieval capabilities, displaying a clear similarity between input and output on both a semantic and a graphical basis [ 82 ]. In a review by Latif et al. [ 101 ], the authors concluded that image content cannot be adequately captured by a single feature representation; instead, it should be described by a combination of low-level features, since these represent the image in the form of patches and thereby increase retrieval performance.
  • In their publication, Rani et al. [ 102 ] reviewed the literature on this topic from 1995 to 2021. The authors found that researchers in microbiology have employed ML techniques for the image recognition of four types of micro-organisms: bacteria, algae, protozoa, and fungi. In their research work, Kasinathan and Uyyala [ 17 ] applied computer vision and knowledge-based approaches to improve insect detection and classification in dense image scenarios. In this work, image processing techniques were applied to extract features, and classification models were built using ML algorithms. The proposed approach used different feature descriptors, such as texture, color, shape, histograms of oriented gradients (HOG), and global image descriptors (GIST). ML was used to analyze multi-variety insect data, making efficient use of resources and improving classification accuracy for field-crop insects with a similar appearance.
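The feature-descriptor pipeline used in such studies (hand-crafted features such as HOG, followed by a classical ML classifier) can be sketched as follows. This is a minimal illustration assuming scikit-image and scikit-learn with synthetic stripe images; the cited works used their own datasets, descriptors, and classifiers.

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def make_image(vertical):
    """Synthetic 32x32 image containing a vertical or horizontal stripe."""
    img = np.zeros((32, 32))
    if vertical:
        img[:, 12:20] = 1.0
    else:
        img[12:20, :] = 1.0
    return img + rng.normal(scale=0.05, size=img.shape)

images = [make_image(v) for v in [True] * 20 + [False] * 20]
targets = [1] * 20 + [0] * 20

# HOG turns each image into a fixed-length descriptor of local
# gradient orientations, as in the feature-extraction step above.
features = np.array([
    hog(im, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2))
    for im in images
])

# A classical ML classifier is then trained on the extracted features.
clf = SVC(kernel="linear").fit(features, targets)
accuracy = clf.score(features, targets)
```

The two stripe orientations produce clearly different gradient-orientation histograms, so a linear SVM separates them easily; real insect or micro-organism imagery would of course require richer descriptors and a proper train/test evaluation.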

4.1.2. Research Using Reinforcement Learning

5. Discussion and Future Directions

6. Conclusions

  • Interest in image-processing systems using DL methods has exponentially increased over the last few years. The most common research disciplines for image processing and AI are medicine, computer science, and engineering.
  • Traditional ML methods are still extremely relevant and are frequently used in fields such as computational biology and disease diagnosis and prediction or to assist in specific tasks when coupled with other more complex methods. DL methods have become of particular interest in many image-processing problems, particularly because of their ability to circumvent some of the challenges that more traditional approaches face.
  • Much of the research effort focuses on improving model performance, reducing computational resources and time, and expanding the application of ML models to solve concrete real-world problems.
  • The medical field has developed a particular interest in research using multiple classes and methods of learning algorithms. DL image processing has proven useful in analyzing medical exams and other imaging applications, while some areas continue to find success with more traditional ML methods.
  • Another area of interest appears to be autonomous driving and driver profiling, possibly powered by the increased access to information available both for the drivers and the vehicles alike. Indeed, modern driving assistance systems have already implemented features such as (a) road lane finding, (b) free driving space finding, (c) traffic sign detection and recognition, (d) traffic light detection and recognition, and (e) road-object detection and tracking. This research field will undoubtedly be responsible for many more studies in the near future.
  • Graphical search engines and content-based image retrieval systems also present themselves as an interesting topic of research for image processing, with a diverse body of work and innovative approaches.

Author Contributions

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Acknowledgments

Conflicts of Interest

Abbreviations

AI: Artificial Intelligence
ML: Machine Learning
DL: Deep Learning
CBIR: Content-Based Image Retrieval
CNN: Convolutional Neural Network
DNN: Deep Neural Network
DCNN: Deep Convolutional Neural Network
RGB: Red, Green, and Blue
  • Raschka, S.; Patterson, J.; Nolet, C. Machine Learning in Python: Main Developments and Technology Trends in Data Science, Machine Learning, and Artificial Intelligence. Information 2020 , 11 , 193. [ Google Scholar ] [ CrossRef ]
  • Barros, D.; Moura, J.; Freire, C.; Taleb, A.; Valentim, R.; Morais, P. Machine learning applied to retinal image processing for glaucoma detection: Review and perspective. BioMed. Eng. OnLine 2020 , 19 , 20. [ Google Scholar ] [ CrossRef ]
  • Zhu, M.; Wang, J.; Yang, X.; Zhang, Y.; Zhang, L.; Ren, H.; Wu, B.; Ye, L. A review of the application of machine learning in water quality evaluation. Eco-Environ. Health 2022 , 1 , 107–116. [ Google Scholar ] [ CrossRef ]
  • Singh, V.; Chen, S.S.; Singhania, M.; Nanavati, B.; Kar, A.K.; Gupta, A. How are reinforcement learning and deep learning algorithms used for big data based decision making in financial industries–A review and research agenda. Int. J. Inf. Manag. Data Insights 2022 , 2 , 100094. [ Google Scholar ] [ CrossRef ]
  • Moscalu, M.; Moscalu, R.; Dascălu, C.G.; Țarcă, V.; Cojocaru, E.; Costin, I.M.; Țarcă, E.; Șerban, I.L. Histopathological Images Analysis and Predictive Modeling Implemented in Digital Pathology—Current Affairs and Perspectives. Diagnostics 2023 , 13 , 2379. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Wang, S.; Yang, D.M.; Rong, R.; Zhan, X.; Fujimoto, J.; Liu, H.; Minna, J.; Wistuba, I.I.; Xie, Y.; Xiao, G. Artificial Intelligence in Lung Cancer Pathology Image Analysis. Cancers 2019 , 11 , 1673. [ Google Scholar ] [ CrossRef ]
  • van der Velden, B.H.M.; Kuijf, H.J.; Gilhuijs, K.G.; Viergever, M.A. Explainable artificial intelligence (XAI) in deep learning-based medical image analysis. Med. Image Anal. 2022 , 79 , 102470. [ Google Scholar ] [ CrossRef ]
  • Prevedello, L.M.; Halabi, S.S.; Shih, G.; Wu, C.C.; Kohli, M.D.; Chokshi, F.H.; Erickson, B.J.; Kalpathy-Cramer, J.; Andriole, K.P.; Flanders, A.E. Challenges related to artificial intelligence research in medical imaging and the importance of image analysis competitions. Radiol. Artif. Intell. 2019 , 1 , e180031. [ Google Scholar ] [ CrossRef ]
  • Smith, K.P.; Kirby, J.E. Image analysis and artificial intelligence in infectious disease diagnostics. Clin. Microbiol. Infect. 2020 , 26 , 1318–1323. [ Google Scholar ] [ CrossRef ]
  • Wu, Q. Research on deep learning image processing technology of second-order partial differential equations. Neural Comput. Appl. 2023 , 35 , 2183–2195. [ Google Scholar ] [ CrossRef ]
  • Jardim, S.; António, J.; Mora, C. Graphical Image Region Extraction with K-Means Clustering and Watershed. J. Imaging 2022 , 8 , 163. [ Google Scholar ] [ CrossRef ]
  • Ying, C.; Huang, Z.; Ying, C. Accelerating the image processing by the optimization strategy for deep learning algorithm DBN. EURASIP J. Wirel. Commun. Netw. 2018 , 232 , 232. [ Google Scholar ] [ CrossRef ]
  • Protopapadakis, E.; Voulodimos, A.; Doulamis, A.; Doulamis, N.; Stathaki, T. Automatic crack detection for tunnel inspection using deep learning and heuristic image post-processing. Appl. Intell. 2019 , 49 , 2793–2806. [ Google Scholar ] [ CrossRef ]
  • Yong, B.; Wang, C.; Shen, J.; Li, F.; Yin, H.; Zhou, R. Automatic ventricular nuclear magnetic resonance image processing with deep learning. Multimed. Tools Appl. 2021 , 80 , 34103–34119. [ Google Scholar ] [ CrossRef ]
  • Freeman, W.; Jones, T.; Pasztor, E. Example-based super-resolution. IEEE Comput. Graph. Appl. 2002 , 22 , 56–65. [ Google Scholar ] [ CrossRef ]
  • Rodellar, J.; Alférez, S.; Acevedo, A.; Molina, A.; Merino, A. Image processing and machine learning in the morphological analysis of blood cells. Int. J. Lab. Hematol. 2018 , 40 , 46–53. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Kasinathan, T.; Uyyala, S.R. Machine learning ensemble with image processing for pest identification and classification in field crops. Neural Comput. Appl. 2021 , 33 , 7491–7504. [ Google Scholar ] [ CrossRef ]
  • Yadav, P.; Gupta, N.; Sharma, P.K. A comprehensive study towards high-level approaches for weapon detection using classical machine learning and deep learning methods. Expert Syst. Appl. 2023 , 212 , 118698. [ Google Scholar ] [ CrossRef ]
  • Suganyadevi, S.; Seethalakshmi, V.; Balasamy, K. A review on deep learning in medical image analysis. Int. J. Multimed. Inf. Retr. 2022 , 11 , 19–38. [ Google Scholar ] [ CrossRef ]
  • Zeng, Y.; Guo, Y.; Li, J. Recognition and extraction of high-resolution satellite remote sensing image buildings based on deep learning. Neural Comput. Appl. 2022 , 34 , 2691–2706. [ Google Scholar ] [ CrossRef ]
  • Pratap, A.; Sardana, N. Machine learning-based image processing in materials science and engineering: A review. Mater. Today Proc. 2022 , 62 , 7341–7347. [ Google Scholar ] [ CrossRef ]
  • Mahesh, B. Machine Learning Algorithms—A Review. Int. J. Sci. Res. 2020 , 9 , 1–6. [ Google Scholar ] [ CrossRef ]
  • Singh, D.P.; Kaushik, B. Machine learning concepts and its applications for prediction of diseases based on drug behaviour: An extensive review. Chemom. Intell. Lab. Syst. 2022 , 229 , 104637. [ Google Scholar ] [ CrossRef ]
  • Lillicrap, T.P.; Hunt, J.J.; Pritzel, A.; Heess, N.; Erez, T.; Tassa, Y.; Silver, D.; Wierstra, D. Continuous control with deep reinforcement learning. In Proceedings of the 4th International Conference on Learning Representations 2016, San Juan, Puerto Rico, 2–4 May 2016. [ Google Scholar ] [ CrossRef ]
  • Dworschak, F.; Dietze, S.; Wittmann, M.; Schleich, B.; Wartzack, S. Reinforcement Learning for Engineering Design Automation. Adv. Eng. Inform. 2022 , 52 , 101612. [ Google Scholar ] [ CrossRef ]
  • Khan, T.; Tian, W.; Zhou, G.; Ilager, S.; Gong, M.; Buyya, R. Machine learning (ML)-centric resource management in cloud computing: A review and future directions. J. Netw. Comput. Appl. 2022 , 204 , 103405. [ Google Scholar ] [ CrossRef ]
  • Botvinick, M.; Ritter, S.; Wang, J.X.; Kurth-Nelson, Z.; Blundell, C.; Hassabis, D. Reinforcement Learning, Fast and Slow. Trends Cogn. Sci. 2019 , 23 , 408–422. [ Google Scholar ] [ CrossRef ]
  • Moravčík, M.; Schmid, M.; Burch, N.; Lisý, V.; Morrill, D.; Bard, N.; Davis, T.; Waugh, K.; Johanson, M.; Bowling, M. DeepStack: Expert-level artificial intelligence in heads-up no-limit poker. Science 2017 , 356 , 508–513. [ Google Scholar ] [ CrossRef ]
  • ElDahshan, K.A.; Farouk, H.; Mofreh, E. Deep Reinforcement Learning based Video Games: A Review. In Proceedings of the 2nd International Mobile, Intelligent, and Ubiquitous Computing Conference (MIUCC), Cairo, Egypt, 8–9 May 2022. [ Google Scholar ] [ CrossRef ]
  • Huawei Technologies Co., Ltd. Overview of Deep Learning. In Artificial Intelligence Technology ; Springer: Singapore, 2023; Chapter 1–4; pp. 87–122. [ Google Scholar ] [ CrossRef ]
  • Le, N.; Rathour, V.S.; Yamazaki, K.; Luu, K.; Savvides, M. Deep reinforcement learning in computer vision: A comprehensive survey. Artif. Intell. Rev. 2022 , 55 , 2733–2819. [ Google Scholar ] [ CrossRef ]
  • Melanthota, S.K.; Gopal, D.; Chakrabarti, S.; Kashyap, A.A.; Radhakrishnan, R.; Mazumder, N. Deep learning-based image processing in optical microscopy. Biophys. Rev. 2022 , 14 , 463–481. [ Google Scholar ] [ CrossRef ]
  • Winovich, N.; Ramani, K.; Lin, G. ConvPDE-UQ: Convolutional neural networks with quantified uncertainty for heterogeneous elliptic partial differential equations on varied domains. J. Comput. Phys. 2019 , 394 , 263–279. [ Google Scholar ] [ CrossRef ]
  • Pham, H.; Warin, X.; Germain, M. Neural networks-based backward scheme for fully nonlinear PDEs. SN Partial. Differ. Equ. Appl. 2021 , 2 , 16. [ Google Scholar ] [ CrossRef ]
  • Wei, X.; Jiang, S.; Li, Y.; Li, C.; Jia, L.; Li, Y. Defect Detection of Pantograph Slide Based on Deep Learning and Image Processing Technology. IEEE Trans. Intell. Transp. Syst. 2020 , 21 , 947–958. [ Google Scholar ] [ CrossRef ]
  • E, W.; Yu, B. The Deep Ritz Method: A deep learning based numerical algorithm for solving variational problems. Commun. Math. Stat. 2018 , 6 , 1–12. [ Google Scholar ] [ CrossRef ]
  • Yamashita, R.; Nishio, M.; Do, R.K.G.; Togashi, K. Convolutional neural networks: An overview and application in radiology. Insights Imaging 2018 , 9 , 611–629. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Acharya, U.R.; Oh, S.; Hagiwara, Y.; Tan, J.; Adam, M.; Gertych, A.; Tan, R. A deep convolutional neural network model to classify heartbeats. Comput. Biol. Med. 2017 , 89 , 389–396. [ Google Scholar ] [ CrossRef ]
  • Ha, V.K.; Ren, J.C.; Xu, X.Y.; Zhao, S.; Xie, G.; Masero, V.; Hussain, A. Deep Learning Based Single Image Super-resolution: A Survey. Int. J. Autom. Comput. 2019 , 16 , 413–426. [ Google Scholar ] [ CrossRef ]
  • Jeong, C.Y.; Yang, H.S.; Moon, K. Fast horizon detection in maritime images using region-of-interest. Int. J. Distrib. Sens. Netw. 2018 , 14 , 1550147718790753. [ Google Scholar ] [ CrossRef ]
  • Olmos, R.; Tabik, S.; Lamas, A.; Pérez-Hernández, F.; Herrera, F. A binocular image fusion approach for minimizing false positives in handgun detection with deep learning. Inf. Fusion 2019 , 49 , 271–280. [ Google Scholar ] [ CrossRef ]
  • Zhao, X.; Wu, Y.; Tian, J.; Zhang, H. Single Image Super-Resolution via Blind Blurring Estimation and Dictionary Learning. Neurocomputing 2016 , 212 , 3–11. [ Google Scholar ] [ CrossRef ]
  • Qi, C.; Song, C.; Xiao, F.; Song, S. Generalization ability of hybrid electric vehicle energy management strategy based on reinforcement learning method. Energy 2022 , 250 , 123826. [ Google Scholar ] [ CrossRef ]
  • Ritto, T.; Beregi, S.; Barton, D. Reinforcement learning and approximate Bayesian computation for model selection and parameter calibration applied to a nonlinear dynamical system. Mech. Syst. Signal Process. 2022 , 181 , 109485. [ Google Scholar ] [ CrossRef ]
  • Hwang, R.; Lee, H.; Hwang, H.J. Option compatible reward inverse reinforcement learning. Pattern Recognit. Lett. 2022 , 154 , 83–89. [ Google Scholar ] [ CrossRef ]
  • Ladosz, P.; Weng, L.; Kim, M.; Oh, H. Exploration in deep reinforcement learning: A survey. Inf. Fusion 2022 , 85 , 1–22. [ Google Scholar ] [ CrossRef ]
  • Khayyat, M.M.; Elrefaei, L.A. Deep reinforcement learning approach for manuscripts image classification and retrieval. Multimed. Tools Appl. 2022 , 81 , 15395–15417. [ Google Scholar ] [ CrossRef ]
  • Nguyen, D.P.; Ho Ba Tho, M.C.; Dao, T.T. Reinforcement learning coupled with finite element modeling for facial motion learning. Comput. Methods Programs Biomed. 2022 , 221 , 106904. [ Google Scholar ] [ CrossRef ]
  • Laskin, M.; Lee, K.; Stooke, A.; Pinto, L.; Abbeel, P.; Srinivas, A. Reinforcement Learning with Augmented Data. In Proceedings of the 34th Conference on Neural Information Processing Systems 2020, Vancouver, BC, Canada, 6–12 December 2020; pp. 19884–19895. [ Google Scholar ]
  • Li, H.; Xu, H. Deep reinforcement learning for robust emotional classification in facial expression recognition. Knowl.-Based Syst. 2020 , 204 , 106172. [ Google Scholar ] [ CrossRef ]
  • Gomes, G.; Vidal, C.A.; Cavalcante-Neto, J.B.; Nogueira, Y.L. A modeling environment for reinforcement learning in games. Entertain. Comput. 2022 , 43 , 100516. [ Google Scholar ] [ CrossRef ]
  • Georgeon, O.L.; Casado, R.C.; Matignon, L.A. Modeling Biological Agents beyond the Reinforcement-learning Paradigm. Procedia Comput. Sci. 2015 , 71 , 17–22. [ Google Scholar ] [ CrossRef ]
  • Yin, S.; Liu, H. Wind power prediction based on outlier correction, ensemble reinforcement learning, and residual correction. Energy 2022 , 250 , 123857. [ Google Scholar ] [ CrossRef ]
  • Badia, A.P.; Piot, B.; Kapturowski, S.; Sprechmann, P.; Vitvitskyi, A.; Guo, D.; Blundell, C. Agent57: Outperforming the Atari Human Benchmark. arXiv 2020 , arXiv:2003.13350. [ Google Scholar ] [ CrossRef ]
  • Zong, K.; Luo, C. Reinforcement learning based framework for COVID-19 resource allocation. Comput. Ind. Eng. 2022 , 167 , 107960. [ Google Scholar ] [ CrossRef ]
  • Mnih, V.; Kavukcuoglu, K.; Silver, D.; Rusu, A.A.; Veness, J.; Bellemare, M.G.; Graves, A.; Riedmiller, M.; Fidjeland, A.K.; Ostrovski, G.; et al. Human-level control through deep reinforcement learning. Nature 2015 , 518 , 529–533. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Ren, J.; Guan, F.; Li, X.; Cao, J.; Li, X. Optimization for image stereo-matching using deep reinforcement learning in rule constraints and parallax estimation. Neural Comput. Appl. 2023 , 1–11. [ Google Scholar ] [ CrossRef ]
  • Morales, E.F.; Murrieta-Cid, R.; Becerra, I.; Esquivel-Basaldua, M.A. A survey on deep learning and deep reinforcement learning in robotics with a tutorial on deep reinforcement learning. Intell. Serv. Robot. 2021 , 14 , 773–805. [ Google Scholar ] [ CrossRef ]
  • Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks. Commun. ACM 2017 , 60 , 84–90. [ Google Scholar ] [ CrossRef ]
  • Krichen, M. Convolutional Neural Networks: A Survey. Computers 2023 , 12 , 151. [ Google Scholar ] [ CrossRef ]
  • Song, D.; Kim, T.; Lee, Y.; Kim, J. Image-Based Artificial Intelligence Technology for Diagnosing Middle Ear Diseases: A Systematic Review. J. Clin. Med. 2023 , 12 , 5831. [ Google Scholar ] [ CrossRef ]
  • Muñoz-Saavedra, L.; Escobar-Linero, E.; Civit-Masot, J.; Luna-Perejón, F.; Civit, A.; Domínguez-Morales, M. A Robust Ensemble of Convolutional Neural Networks for the Detection of Monkeypox Disease from Skin Images. Sensors 2023 , 23 , 7134. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Wang, Y.; Hargreaves, C.A. A Review Study of the Deep Learning Techniques used for the Classification of Chest Radiological Images for COVID-19 Diagnosis. Int. J. Inf. Manag. Data Insights 2022 , 2 , 100100. [ Google Scholar ] [ CrossRef ]
  • Teng, Y.; Pan, D.; Zhao, W. Application of deep learning ultrasound imaging in monitoring bone healing after fracture surgery. J. Radiat. Res. Appl. Sci. 2023 , 16 , 100493. [ Google Scholar ] [ CrossRef ]
  • Zaghari, N.; Fathy, M.; Jameii, S.M.; Sabokrou, M.; Shahverdy, M. Improving the learning of self-driving vehicles based on real driving behavior using deep neural network techniques. J. Supercomput. 2021 , 77 , 3752–3794. [ Google Scholar ] [ CrossRef ]
  • Farag, W. Cloning Safe Driving Behavior for Self-Driving Cars using Convolutional Neural Networks. Recent Patents Comput. Sci. 2019 , 11 , 120–127. [ Google Scholar ] [ CrossRef ]
  • Agyemang, I.; Zhang, X.; Acheampong, D.; Adjei-Mensah, I.; Kusi, G.; Mawuli, B.C.; Agbley, B.L. Autonomous health assessment of civil infrastructure using deep learning and smart devices. Autom. Constr. 2022 , 141 , 104396. [ Google Scholar ] [ CrossRef ]
  • Zhou, S.; Canchila, C.; Song, W. Deep learning-based crack segmentation for civil infrastructure: Data types, architectures, and benchmarked performance. Autom. Constr. 2023 , 146 , 104678. [ Google Scholar ] [ CrossRef ]
  • Guerrieri, M.; Parla, G. Flexible and stone pavements distress detection and measurement by deep learning and low-cost detection devices. Eng. Fail. Anal. 2022 , 141 , 106714. [ Google Scholar ] [ CrossRef ]
  • Hoang, N.; Nguyen, Q. A novel method for asphalt pavement crack classification based on image processing and machine learning. Eng. Comput. 2019 , 35 , 487–498. [ Google Scholar ] [ CrossRef ]
  • Tabrizi, S.E.; Xiao, K.; Van Griensven Thé, J.; Saad, M.; Farghaly, H.; Yang, S.X.; Gharabaghi, B. Hourly road pavement surface temperature forecasting using deep learning models. J. Hydrol. 2021 , 603 , 126877. [ Google Scholar ] [ CrossRef ]
  • Jardim, S.V.B. Sparse and Robust Signal Reconstruction. Theory Appl. Math. Comput. Sci. 2015 , 5 , 1–19. [ Google Scholar ]
  • Jackulin, C.; Murugavalli, S. A comprehensive review on detection of plant disease using machine learning and deep learning approaches. Meas. Sens. 2022 , 24 , 100441. [ Google Scholar ] [ CrossRef ]
  • Keceli, A.S.; Kaya, A.; Catal, C.; Tekinerdogan, B. Deep learning-based multi-task prediction system for plant disease and species detection. Ecol. Inform. 2022 , 69 , 101679. [ Google Scholar ] [ CrossRef ]
  • Kotwal, J.; Kashyap, D.; Pathan, D. Agricultural plant diseases identification: From traditional approach to deep learning. Mater. Today Proc. 2023 , 80 , 344–356. [ Google Scholar ] [ CrossRef ]
  • Naik, A.; Thaker, H.; Vyas, D. A survey on various image processing techniques and machine learning models to detect, quantify and classify foliar plant disease. Proc. Indian Natl. Sci. Acad. 2021 , 87 , 191–198. [ Google Scholar ] [ CrossRef ]
  • Thaiyalnayaki, K.; Joseph, C. Classification of plant disease using SVM and deep learning. Mater. Today Proc. 2021 , 47 , 468–470. [ Google Scholar ] [ CrossRef ]
  • Carnegie, A.J.; Eslick, H.; Barber, P.; Nagel, M.; Stone, C. Airborne multispectral imagery and deep learning for biosecurity surveillance of invasive forest pests in urban landscapes. Urban For. Urban Green. 2023 , 81 , 127859. [ Google Scholar ] [ CrossRef ]
  • Hadipour-Rokni, R.; Askari Asli-Ardeh, E.; Jahanbakhshi, A.; Esmaili paeen-Afrakoti, I.; Sabzi, S. Intelligent detection of citrus fruit pests using machine vision system and convolutional neural network through transfer learning technique. Comput. Biol. Med. 2023 , 155 , 106611. [ Google Scholar ] [ CrossRef ]
  • Agrawal, P.; Chaudhary, D.; Madaan, V.; Zabrovskiy, A.; Prodan, R.; Kimovski, D.; Timmerer, C. Automated bank cheque verification using image processing and deep learning methods. Multimed. Tools Appl. 2021 , 80 , 5319–5350. [ Google Scholar ] [ CrossRef ]
  • Gordo, A.; Almazán, J.; Revaud, J.; Larlus, D. Deep Image Retrieval: Learning Global Representations for Image Search. In Proceedings of the Computer Vision—ECCV 2016, Amsterdam, The Netherlands, 11–14 October 2016; Leibe, B., Matas, J., Sebe, N., Welling, M., Eds.; Springer International Publishing: Cham, Switzerland, 2016; pp. 241–257. [ Google Scholar ]
  • Jardim, S.; António, J.; Mora, C.; Almeida, A. A Novel Trademark Image Retrieval System Based on Multi-Feature Extraction and Deep Networks. J. Imaging 2022 , 8 , 238. [ Google Scholar ] [ CrossRef ]
  • Lin, K.; Yang, H.F.; Hsiao, J.H.; Chen, C.S. Deep learning of binary hash codes for fast image retrieval. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Boston, MA, USA, 7–12 June 2015; pp. 27–35. [ Google Scholar ] [ CrossRef ]
  • Andriasyan, V.; Yakimovich, A.; Petkidis, A.; Georgi, F.; Georgi, R.; Puntener, D.; Greber, U. Microscopy deep learning predicts virus infections and reveals mechanics of lytic-infected cells. iScience 2021 , 24 , 102543. [ Google Scholar ] [ CrossRef ]
  • Lüneburg, N.; Reiss, N.; Feldmann, C.; van der Meulen, P.; van de Steeg, M.; Schmidt, T.; Wendl, R.; Jansen, S. Photographic LVAD Driveline Wound Infection Recognition Using Deep Learning. In dHealth 2019—From eHealth to dHealth ; IOS Press: Amsterdam, The Netherlands, 2019; pp. 192–199. [ Google Scholar ] [ CrossRef ]
  • Fink, O.; Wang, Q.; Svensén, M.; Dersin, P.; Lee, W.J.; Ducoffe, M. Potential, challenges and future directions for deep learning in prognostics and health management applications. Eng. Appl. Artif. Intell. 2020 , 92 , 103678. [ Google Scholar ] [ CrossRef ]
  • Ahmed, I.; Ahmad, M.; Jeon, G. Social distance monitoring framework using deep learning architecture to control infection transmission of COVID-19 pandemic. Sustain. Cities Soc. 2021 , 69 , 102777. [ Google Scholar ] [ CrossRef ]
  • Hussain, S.; Yu, Y.; Ayoub, M.; Khan, A.; Rehman, R.; Wahid, J.A.; Hou, W. IoT and Deep Learning Based Approach for Rapid Screening and Face Mask Detection for Infection Spread Control of COVID-19. Appl. Sci. 2021 , 11 , 3495. [ Google Scholar ] [ CrossRef ]
  • Kaur, J.; Kaur, P. Outbreak COVID-19 in Medical Image Processing Using Deep Learning: A State-of-the-Art Review. Arch. Comput. Methods Eng. 2022 , 29 , 2351–2382. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Groen, A.M.; Kraan, R.; Amirkhan, S.F.; Daams, J.G.; Maas, M. A systematic review on the use of explainability in deep learning systems for computer aided diagnosis in radiology: Limited use of explainable AI? Eur. J. Radiol. 2022 , 157 , 110592. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Hao, D.; Li, Q.; Feng, Q.X.; Qi, L.; Liu, X.S.; Arefan, D.; Zhang, Y.D.; Wu, S. SurvivalCNN: A deep learning-based method for gastric cancer survival prediction using radiological imaging data and clinicopathological variables. Artif. Intell. Med. 2022 , 134 , 102424. [ Google Scholar ] [ CrossRef ]
  • Cui, X.; Zheng, S.; Heuvelmans, M.A.; Du, Y.; Sidorenkov, G.; Fan, S.; Li, Y.; Xie, Y.; Zhu, Z.; Dorrius, M.D.; et al. Performance of a deep learning-based lung nodule detection system as an alternative reader in a Chinese lung cancer screening program. Eur. J. Radiol. 2022 , 146 , 110068. [ Google Scholar ] [ CrossRef ]
  • Liu, L.; Li, C. Comparative study of deep learning models on the images of biopsy specimens for diagnosis of lung cancer treatment. J. Radiat. Res. Appl. Sci. 2023 , 16 , 100555. [ Google Scholar ] [ CrossRef ]
  • Muniz, F.B.; de Freitas Oliveira Baffa, M.; Garcia, S.B.; Bachmann, L.; Felipe, J.C. Histopathological diagnosis of colon cancer using micro-FTIR hyperspectral imaging and deep learning. Comput. Methods Programs Biomed. 2023 , 231 , 107388. [ Google Scholar ] [ CrossRef ]
  • Gomes, S.L.; de S. Rebouças, E.; Neto, E.C.; Papa, J.P.; de Albuquerque, V.H.C.; Filho, P.P.R.; Tavares, J.M.R.S. Embedded real-time speed limit sign recognition using image processing and machine learning techniques. Neural Comput. Appl. 2017 , 28 , 573–584. [ Google Scholar ] [ CrossRef ]
  • Monga, V.; Li, Y.; Eldar, Y.C. Algorithm Unrolling: Interpretable, Efficient Deep Learning for Signal and Image Processing. IEEE Signal Process. Mag. 2021 , 38 , 18–44. [ Google Scholar ] [ CrossRef ]
  • Zhang, L.; Cheng, L.; Li, H.; Gao, J.; Yu, C.; Domel, R.; Yang, Y.; Tang, S.; Liu, W.K. Hierarchical deep-learning neural networks: Finite elements and beyond. Comput. Mech. 2021 , 67 , 207–230. [ Google Scholar ] [ CrossRef ]
  • Salahzadeh, Z.; Rezaei-Hachesu, P.; Gheibi, Y.; Aghamali, A.; Pakzad, H.; Foladlou, S.; Samad-Soltani, T. A mechatronics data collection, image processing, and deep learning platform for clinical posture analysis: A technical note. Phys. Eng. Sci. Med. 2021 , 44 , 901–910. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Singh, P.; Hrisheekesha, P.; Singh, V.K. CBIR-CNN: Content-Based Image Retrieval on Celebrity Data Using Deep Convolution Neural Network. Recent Adv. Comput. Sci. Commun. 2021 , 14 , 257–272. [ Google Scholar ] [ CrossRef ]
  • Varga, D.; Szirányi, T. Fast content-based image retrieval using convolutional neural network and hash function. In Proceedings of the 2016 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Budapest, Hungary, 9–12 October 2016; pp. 2636–2640. [ Google Scholar ] [ CrossRef ]
  • Latif, A.; Rasheed, A.; Sajid, U.; Ahmed, J.; Ali, N.; Ratyal, N.I.; Zafar, B.; Dar, S.H.; Sajid, M.; Khalil, T. Content-Based Image Retrieval and Feature Extraction: A Comprehensive Review. Math. Probl. Eng. 2019 , 2019 , 9658350. [ Google Scholar ] [ CrossRef ]
  • Rani, P.; Kotwal, S.; Manhas, J.; Sharma, V.; Sharma, S. Machine Learning and Deep Learning Based Computational Approaches in Automatic Microorganisms Image Recognition: Methodologies, Challenges, and Developments. Arch. Comput. Methods Eng. 2022 , 29 , 1801–1837. [ Google Scholar ] [ CrossRef ]
  • Jardim, S.V.B.; Figueiredo, M.A.T. Automatic Analysis of Fetal Echographic Images. Proc. Port. Conf. Pattern Recognit. 2002 , 1 , 1–6. [ Google Scholar ]
  • Jardim, S.V.B.; Figueiredo, M.A.T. Automatic contour estimation in fetal ultrasound images. In Proceedings of the 2003 International Conference on Image Processing 2003, Barcelona, Spain, 14–17 September 2003; Volum 1, pp. 1065–1068. [ Google Scholar ] [ CrossRef ]
  • Devunooru, S.; Alsadoon, A.; Chandana, P.W.C.; Beg, A. Deep learning neural networks for medical image segmentation of brain tumours for diagnosis: A recent review and taxonomy. J. Ambient Intell. Humaniz. Comput. 2021 , 12 , 455–483. [ Google Scholar ] [ CrossRef ]
  • Anaya-Isaza, A.; Mera-Jiménez, L.; Verdugo-Alejo, L.; Sarasti, L. Optimizing MRI-based brain tumor classification and detection using AI: A comparative analysis of neural networks, transfer learning, data augmentation, and the cross-transformer network. Eur. J. Radiol. Open 2023 , 10 , 100484. [ Google Scholar ] [ CrossRef ]
  • Cao, Y.; Kunaprayoon, D.; Xu, J.; Ren, L. AI-assisted clinical decision making (CDM) for dose prescription in radiosurgery of brain metastases using three-path three-dimensional CNN. Clin. Transl. Radiat. Oncol. 2023 , 39 , 100565. [ Google Scholar ] [ CrossRef ]
  • Chakrabarty, N.; Mahajan, A.; Patil, V.; Noronha, V.; Prabhash, K. Imaging of brain metastasis in non-small-cell lung cancer: Indications, protocols, diagnosis, post-therapy imaging, and implications regarding management. Clin. Radiol. 2023 , 78 , 175–186. [ Google Scholar ] [ CrossRef ]
  • Mehrotra, R.; Ansari, M.; Agrawal, R.; Anand, R. A Transfer Learning approach for AI-based classification of brain tumors. Mach. Learn. Appl. 2020 , 2 , 100003. [ Google Scholar ] [ CrossRef ]
  • Drai, M.; Testud, B.; Brun, G.; Hak, J.F.; Scavarda, D.; Girard, N.; Stellmann, J.P. Borrowing strength from adults: Transferability of AI algorithms for paediatric brain and tumour segmentation. Eur. J. Radiol. 2022 , 151 , 110291. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Ranjbarzadeh, R.; Caputo, A.; Tirkolaee, E.B.; Jafarzadeh Ghoushchi, S.; Bendechache, M. Brain tumor segmentation of MRI images: A comprehensive review on the application of artificial intelligence tools. Comput. Biol. Med. 2023 , 152 , 106405. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Yedder, H.B.; Cardoen, B.; Hamarneh, G. Deep learning for biomedical image reconstruction: A survey. Artif. Intell. Rev. 2021 , 54 , 215–251. [ Google Scholar ] [ CrossRef ]
  • Manuel Davila Delgado, J.; Oyedele, L. Robotics in construction: A critical review of the reinforcement learning and imitation learning paradigms. Adv. Eng. Inform. 2022 , 54 , 101787. [ Google Scholar ] [ CrossRef ]
  • Íñigo Elguea-Aguinaco; Serrano-Muñoz, A.; Chrysostomou, D.; Inziarte-Hidalgo, I.; Bøgh, S.; Arana-Arexolaleiba, N. A review on reinforcement learning for contact-rich robotic manipulation tasks. Robot. Comput.-Integr. Manuf. 2023 , 81 , 102517. [ Google Scholar ] [ CrossRef ]
  • Ahn, K.H.; Na, M.; Song, J.B. Robotic assembly strategy via reinforcement learning based on force and visual information. Robot. Auton. Syst. 2023 , 164 , 104399. [ Google Scholar ] [ CrossRef ]
  • Jafari, M.; Xu, H.; Carrillo, L.R.G. A biologically-inspired reinforcement learning based intelligent distributed flocking control for Multi-Agent Systems in presence of uncertain system and dynamic environment. IFAC J. Syst. Control 2020 , 13 , 100096. [ Google Scholar ] [ CrossRef ]
  • Wang, X.; Liu, S.; Yu, Y.; Yue, S.; Liu, Y.; Zhang, F.; Lin, Y. Modeling collective motion for fish schooling via multi-agent reinforcement learning. Ecol. Model. 2023 , 477 , 110259. [ Google Scholar ] [ CrossRef ]
  • Jain, D.K.; Dutta, A.K.; Verdú, E.; Alsubai, S.; Sait, A.R.W. An automated hyperparameter tuned deep learning model enabled facial emotion recognition for autonomous vehicle drivers. Image Vis. Comput. 2023 , 133 , 104659. [ Google Scholar ] [ CrossRef ]
  • Silver, D.; Hubert, T.; Schrittwieser, J.; Antonoglou, I.; Lai, M.; Guez, A.; Lanctot, M.; Sifre, L.; Kumaran, D.; Graepel, T.; et al. A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science 2018 , 362 , 1140–1144. [ Google Scholar ] [ CrossRef ]
  • Ueda, M. Memory-two strategies forming symmetric mutual reinforcement learning equilibrium in repeated prisoners’ dilemma game. Appl. Math. Comput. 2023 , 444 , 127819. [ Google Scholar ] [ CrossRef ]
  • Wang, X.; Liu, F.; Ma, X. Mixed distortion image enhancement method based on joint of deep residuals learning and reinforcement learning. Signal Image Video Process. 2021 , 15 , 995–1002. [ Google Scholar ] [ CrossRef ]
  • Dai, Y.; Wang, G.; Muhammad, K.; Liu, S. A closed-loop healthcare processing approach based on deep reinforcement learning. Multimed. Tools Appl. 2022 , 81 , 3107–3129. [ Google Scholar ] [ CrossRef ]



Share and Cite

Valente, J.; António, J.; Mora, C.; Jardim, S. Developments in Image Processing Using Deep Learning and Reinforcement Learning. J. Imaging 2023, 9, 207. https://doi.org/10.3390/jimaging9100207



Research Topics

Biomedical Imaging


The current plethora of imaging technologies such as magnetic resonance imaging (MRI), computed tomography (CT), positron emission tomography (PET), optical coherence tomography (OCT), and ultrasound provide great insight into the different anatomical and functional processes of the human body.

Computer Vision


Computer vision is the science and technology of teaching a computer to interpret images and video as well as a typical human. Technically, computer vision encompasses the fields of image/video processing, pattern recognition, biological vision, artificial intelligence, augmented reality, mathematical modeling, statistics, probability, optimization, 2D sensors, and photography.

Image Segmentation/Classification


Extracting information from a digital image often depends on first identifying desired objects or breaking down the image into homogeneous regions (a process called 'segmentation') and then assigning these objects to particular classes (a process called 'classification'). This is a fundamental part of computer vision, combining image processing and pattern recognition techniques.

Multiresolution Techniques


The VIP lab has a particularly extensive history with multiresolution methods, and a significant number of research students have explored this theme. Multiresolution methods are very broad, essentially meaning that an image or video is modeled, represented, or has features extracted on more than one scale, allowing both local and non-local phenomena to be captured.

Remote Sensing


Remote sensing, or the science of capturing data of the earth from airplanes or satellites, enables regular monitoring of land, ocean, and atmosphere expanses, representing data that cannot be captured using any other means. A vast amount of information is generated by remote sensing platforms and there is an obvious need to analyze the data accurately and efficiently.

Scientific Imaging


Scientific Imaging refers to working on two- or three-dimensional imagery taken for a scientific purpose, in most cases acquired either through a microscope or remotely-sensed images taken at a distance.

Stochastic Models


In many image processing, computer vision, and pattern recognition applications, there is often a large degree of uncertainty associated with factors such as the appearance of the underlying scene within the acquired data, the location and trajectory of the object of interest, and the physical appearance (e.g., size, shape, color) of the objects being detected.

Video Analysis


Video analysis is a field within computer vision that involves the automatic interpretation of digital video using computer algorithms. Although humans are readily able to interpret digital video, developing algorithms for the computer to perform the same task has proven highly elusive, making this an active research field.


Evolutionary Deep Intelligence

Deep learning has shown considerable promise in recent years, producing tremendous results and significantly improving the accuracy of a variety of challenging problems when compared to other machine learning methods.


Discovery Radiomics

Radiomics, which involves the high-throughput extraction and analysis of a large amount of quantitative features from medical imaging data to characterize tumor phenotype in a quantitative manner, is ushering in a new era of imaging-driven quantitative personalized cancer decision support and management. 


Sports Analytics

Sports Analytics is a growing field in computer vision that analyzes visual cues from images to provide statistical data on players, teams, and games. Want to know how a player's technique improves the quality of the team? Can a team, based on its defensive position, increase its chances of reaching the finals? These are a few of the many questions that sports analytics answers.



OpenCV

Open Computer Vision Library

Computer Vision and Image Processing: Understanding the Distinction and Interconnection

Explore the essentials of Computer Vision and Image Processing in this easy-to-follow guide. Discover their unique roles and combined impact in today's tech-driven world, tailored for beginners.

Farooq Alvi, December 13, 2023


In today’s digital world, computers are learning to ‘see’ and ‘understand’ images just like humans. But how do they do it? This fascinating journey involves two key fields: Computer Vision and Image Processing . While they may sound similar, they have distinct roles in the world of technology. Let’s dive in to understand these exciting fields better!

What is Image Processing?

The art of beautifying images.

Imagine you have a photograph that isn’t quite perfect – maybe it’s too dark, or the colors are dull. Image processing is like a magic wand that transforms this photo into a better version. It involves altering or improving digital images using various methods and tools. Think of it as editing a photo to make it look more appealing or to highlight certain features. It’s all about changing the image itself.

What is Computer Vision?

Teaching computers to interpret images.

Now, imagine a robot looking at the same photograph. Unlike humans, it doesn’t naturally understand what it’s seeing. This is where computer vision comes in. It’s like teaching the robot to recognize and understand the content of the image – is it a picture of a cat, a car, or a tree? Computer vision doesn’t change the image. Instead, it tries to make sense of it, much like how our brain interprets what our eyes see.

Core Principles & Techniques

Computer Vision (CV): Seeing Beyond the Surface

In the realm of Computer Vision, the goal is to teach computers to understand and interpret visual information from the world around them. Let’s explore some of the key principles and techniques that make this possible:

Pattern Recognition

Think of this as teaching a computer to play a game of ‘spot the difference’. By recognizing patterns, computers can identify similarities and differences in images. This skill is crucial for tasks like facial recognition or identifying objects in a scene.

Deep Learning

Deep Learning is like giving a computer a very complex brain that learns from examples. By feeding it thousands, or even millions, of images, a computer learns to identify and understand various elements in these images. This is the backbone of modern computer vision, enabling machines to recognize objects, people, and even emotions.

Object Detection

This is where computers get really smart. Object detection is about identifying specific objects within an image. It’s like teaching a computer to not just see a scene, but to understand what each part of that scene is. For instance, in a street scene, it can distinguish cars, people, trees, and buildings.

Image Processing: Transforming Pixels into Perfection

In the world of Image Processing, the magic lies in altering and enhancing images to make them more useful or visually appealing. Let’s break down some of the fundamental principles and techniques:

Image Enhancement

This is like giving a makeover to an image. Image enhancement can brighten up a dark photo, bring out hidden details, or make colors pop. It’s all about improving the look and feel of an image to make it more pleasing or informative.

Image Filtering

Imagine sifting through the 'noise' to find the real picture. Image filtering involves removing or reducing unwanted elements from an image: blurring noise away, smoothing rough edges, or sharpening blurry parts. It helps clean up the image to highlight the important features.
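As a concrete sketch of such a filter, here is a minimal 3×3 box (mean) blur over a grayscale image stored as nested lists. The function name and sample values are illustrative, not taken from the article:

```python
def box_blur(image):
    """3x3 box blur: replace each interior pixel with the mean of its
    3x3 neighbourhood (borders are left unchanged for simplicity)."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            out[r][c] = sum(image[r + i][c + j]
                            for i in (-1, 0, 1) for j in (-1, 0, 1)) // 9
    return out

# A single bright "noise" pixel in a flat region gets smoothed out
noisy = [[10, 10, 10],
         [10, 90, 10],
         [10, 10, 10]]
print(box_blur(noisy))  # the centre value drops from 90 to 18
```

Real toolkits use larger kernels and Gaussian weights, but the idea, averaging each pixel with its neighbours, is the same.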

Transformation Techniques

This is where an image can take on a new shape or form. Transformation techniques might include resizing an image, rotating it, or even warping it to change perspective. It’s like reshaping the image to fit a specific purpose or requirement.
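Rotation and resizing can be sketched in a few lines. The nearest-neighbour resize below is one simple assumption about how the pixel mapping is done (real tools usually interpolate); both function names are illustrative:

```python
def rotate90(image):
    """Rotate a 2D image (a list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

def resize_nearest(image, new_h, new_w):
    """Nearest-neighbour resize: each output pixel copies the closest
    input pixel, chosen by integer coordinate scaling."""
    h, w = len(image), len(image[0])
    return [[image[r * h // new_h][c * w // new_w] for c in range(new_w)]
            for r in range(new_h)]

img = [[1, 2],
       [3, 4]]
print(rotate90(img))             # [[3, 1], [4, 2]]
print(resize_nearest(img, 4, 4)) # each pixel duplicated into a 2x2 block
```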

These techniques form the toolbox of image processing, enabling us to manipulate and enhance images in countless ways.

Distinctions Between Computer Vision and Image Processing

Image Processing: Visual Perfection

The primary aim of image processing is to improve image quality. Whether it’s enhancing contrast, adjusting colors, or smoothing edges, the focus is on making the image more visually appealing or suitable for further use. It’s about transforming the raw image into a refined version of itself.

Image processing focuses on enhancing and transforming images. It’s vital in fields like digital photography for color correction, medical imaging for clearer scans, and graphic design for creating stunning visuals. These transformations not only improve aesthetics but also make images more suitable for analysis, laying the groundwork for deeper interpretation, including by computer vision systems.

Computer Vision: Decoding the Visual World

Computer vision, on the other hand, seeks to extract meaning from images. The goal isn’t to change how the image looks but to understand what the image represents. This involves identifying objects, interpreting scenes, and even recognizing patterns and behaviors within the image. It’s more about comprehension rather than alteration.

Computer Vision, conversely, aims to extract meaning and understanding from images. It’s at the heart of AI and robotics, helping machines recognize faces, interpret road scenes for autonomous vehicles, and understand human behavior. The success of these tasks often relies on the quality of image processing. High-quality, well-processed images can significantly enhance the accuracy of computer vision algorithms.

Techniques and Tools

Image Processing Techniques and Tools

In image processing, the toolkit includes a range of software and algorithms specifically designed for modifying images. This includes:

  • Software like Photoshop and GIMP, for manual edits such as retouching and resizing.
  • Algorithms for automated tasks like histogram equalization for contrast adjustment and filters for noise reduction and edge enhancement.
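Histogram equalization, mentioned above, can be sketched in pure Python. This is an illustrative implementation of the classic CDF-based mapping, not the exact algorithm any particular tool uses:

```python
def equalize_histogram(pixels, levels=256):
    """Spread pixel intensities across the full range using the
    cumulative distribution function (CDF) of the histogram."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    cdf, total = [], 0
    for count in hist:          # running sum of the histogram
        total += count
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(pixels)
    if n == cdf_min:            # flat image: nothing to spread
        return pixels[:]
    # classic equalization mapping to [0, levels - 1]
    return [round((cdf[p] - cdf_min) / (n - cdf_min) * (levels - 1))
            for p in pixels]

# A dark, low-contrast 3x3 image, flattened to a list
dark = [50, 50, 52, 52, 54, 54, 56, 56, 58]
print(equalize_histogram(dark))  # intensities now span 0..255
```

After equalization the narrow band of dark values is stretched over the whole 0-255 range, which is exactly the contrast boost the technique is used for.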

Computer Vision Techniques and Tools

Computer Vision, on the other hand, employs a different set of methodologies:

  • Machine Learning and Deep Learning algorithms such as Convolutional Neural Networks (CNNs), pivotal for tasks like image classification and object recognition.
  • Pattern recognition tools used to identify and classify objects within an image, essential for applications like facial recognition.

Interconnection and Overlap: Synergy in Sight

This section illustrates the essential relationship between image processing and computer vision, showcasing their collaborative role in advanced technological applications.

Building Blocks: Image Processing in Computer Vision


Pre-processing in Computer Vision: Many computer vision algorithms require pre-processed images. Techniques like noise reduction and contrast enhancement from image processing improve the accuracy of computer vision tasks.

Feature Extraction: Simplified or enhanced images from image processing are easier for computer vision algorithms to analyze and interpret.

Integrated Systems: Collaborative Power

Both fields often work in tandem in complex systems:

Autonomous Vehicles: Computer vision systems rely on image processing to clarify and enhance road imagery for better object detection and obstacle avoidance.

Medical Imaging Analysis: Image processing is used to enhance medical images like MRIs or X-rays, which are then analyzed by computer vision algorithms for diagnosis and research.

Applications and Real-World Examples: Transforming Industries

Diverse Industries Benefiting from These Technologies

Medical Imaging: Image processing enhances medical scans for clarity, which are then analyzed by computer vision to detect abnormalities, aiding in early diagnosis and treatment planning.

Autonomous Vehicles: Utilize image processing for clear visual input, which is essential for computer vision systems to accurately identify and react to road signs, pedestrians, and other vehicles.

Surveillance

Security Systems: Image processing improves image quality from cameras, aiding computer vision in accurately recognizing faces or suspicious activities and enhancing security measures.

Entertainment

Film and Gaming: Image processing is used for visual effects, while computer vision contributes to interactive experiences, like augmented reality games.

Case Studies: Integrating Computer Vision and Image Processing

Smart City Projects

Traffic Management Systems: Utilize image processing to enhance traffic camera feeds, which are then analyzed by computer vision for managing traffic flow and detecting incidents.

Agricultural Technology

Crop Monitoring Systems: Image processing clarifies aerial images of crops, and computer vision analyzes these images to assess crop health and growth, optimizing agricultural practices.

These examples and case studies highlight the impactful and transformative role of image processing and computer vision across various sectors, demonstrating their critical contribution to technological advancements.

Conclusion: The Convergence of Vision and Processing in the Digital Age

In summary, Computer Vision and Image Processing, though distinct in their goals and techniques, are interconnected fields that play a pivotal role in the advancement of modern technology. Image processing sets the stage by enhancing and transforming images, which are then interpreted and understood through computer vision. Together, they are revolutionizing industries such as healthcare, automotive, surveillance, and entertainment, driving innovation and opening new frontiers in technology.

Understanding these fields and their interplay is crucial for anyone looking to engage with the latest in tech development and application.





Overview of Research Progress of Digital Image Processing Technology

Yitao Huang 1

Published under licence by IOP Publishing Ltd in Journal of Physics: Conference Series, Volume 2386, The International Conference on Computing Innovation and Applied Physics (CONF-CIAP 2022), 20 August 2022, Online.

Citation: Yitao Huang 2022 J. Phys.: Conf. Ser. 2386 012034. DOI: 10.1088/1742-6596/2386/1/012034


Author affiliations

1 Department of Electronic Science and Technology, Tongji University, Shanghai, 201804, China


With the rapid development of modern information technology, digital image processing has developed rapidly and is extensively applied in daily life and production. It plays an inestimable role in remote sensing, medicine, recognition, and other fields. This paper briefly introduces the basic concept of digital image processing, summarizes and analyses commonly used digital image processing techniques and the latest scientific research achievements from four aspects, and proposes future development directions for digital image processing. In the future, the field will pay more attention to artificial intelligence algorithms and achieve better processing results by optimizing logical structures. Through simplified image algorithms, the application scope of digital image processing will gradually expand, developing in the direction of miniaturization, intelligence, and convenience.


Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence . Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

Image Processing: Techniques, Types, & Applications [2023]

Rohit Kundu

Deep learning has revolutionized the world of computer vision—the ability for machines to “see” and interpret the world around them.

In particular, Convolutional Neural Networks (CNNs) were designed to process image data more efficiently than traditional Multi-Layer Perceptrons (MLPs).

Since images contain a consistent pattern spanning several pixels, processing them one pixel at a time—as MLPs do—is inefficient.

This is why CNNs that process images in patches or windows are now the de-facto choice for image processing tasks.
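The patch-wise operation underlying CNNs is the 2-D convolution. A minimal pure-Python sketch (in the cross-correlation form that deep-learning libraries actually compute, with an illustrative edge-detecting kernel):

```python
def convolve2d(image, kernel):
    """'Valid' 2D convolution: slide the kernel window across the image
    and sum the element-wise products (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

# A vertical-edge kernel responds only where intensity jumps
img = [[0, 0, 9, 9],
       [0, 0, 9, 9],
       [0, 0, 9, 9]]
edge = [[-1, 1],
        [-1, 1]]
print(convolve2d(img, edge))  # non-zero only at the 0 -> 9 step
```

A CNN learns the kernel values from data instead of hand-designing them, but each layer applies exactly this windowed operation.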

But let’s start from the beginning—


Here's what we'll cover:

  • What is Image Processing?
  • How Machines “See” Images?
  • Phases of Image Processing
  • Image Processing Techniques

Digital image processing is the class of methods that deal with manipulating digital images through the use of computer algorithms. It is an essential preprocessing step in many applications, such as face recognition, object detection, and image compression.

Image processing is done to enhance an existing image or to sift out important information from it. This is important in several Deep Learning-based Computer Vision applications, where such preprocessing can dramatically boost the performance of a model. Manipulating images, for example, adding or removing objects to images, is another application, especially in the entertainment industry.

This paper addresses a medical image segmentation problem, where the authors used image inpainting in their preprocessing pipeline for the removal of artifacts from dermoscopy images. Examples of this operation are shown below.


The authors achieved a 3% boost in performance with this simple preprocessing procedure, which is a considerable enhancement, especially in a biomedical application where the accuracy of diagnosis is crucial for AI systems. The quantitative results obtained with and without preprocessing for the lesion segmentation problem in three different datasets are shown below.


Types of Images / How Machines “See” Images?

Digital images are interpreted as 2D or 3D matrices by a computer, where each value or pixel in the matrix represents the amplitude, known as the “intensity” of the pixel. Typically, we are used to dealing with 8-bit images, wherein the amplitude value ranges from 0 to 255.


Thus, a computer “sees” digital images as a function: I(x, y) or I(x, y, z) , where “ I ” is the pixel intensity and (x, y) or (x, y, z) represent the coordinates (for binary/grayscale or RGB images respectively) of the pixel in the image.
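Concretely, for a grayscale image this function is just matrix indexing. The tiny 3×3 matrix below is made up for illustration:

```python
# A 3x3 grayscale image: a 2D matrix of 8-bit intensities (0-255)
I = [[  0, 128, 255],
     [ 34,  70, 190],
     [255, 255,   0]]

def intensity(x, y):
    """I(x, y): intensity of the pixel at column x, row y."""
    return I[y][x]

print(intensity(2, 0))  # 255 (top-right pixel)
print(intensity(1, 1))  # 70  (centre pixel)
```

Note the row/column flip: image coordinates put x along the columns and y down the rows, which is why the lookup is `I[y][x]`.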


Computers deal with different “types” of images based on their function representations. Let us look into them next.

1. Binary Image

Images that have only two unique values of pixel intensity, 0 (representing black) and 1 (representing white), are called binary images. Such images are generally used to highlight a discriminating portion of a colored image. For example, they are commonly used for image segmentation, as shown below.
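A binary image is typically produced by thresholding a grayscale one; a minimal sketch, with the threshold value chosen arbitrarily for illustration:

```python
def threshold(image, t):
    """Binarize a grayscale image: 1 where intensity >= t, else 0."""
    return [[1 if p >= t else 0 for p in row] for row in image]

gray = [[ 12, 200],
        [180,  30]]
print(threshold(gray, 128))  # [[0, 1], [1, 0]]
```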


2. Grayscale Image

Grayscale or 8-bit images are composed of 256 unique colors, where a pixel intensity of 0 represents the black color and pixel intensity of 255 represents the white color. All the other 254 values in between are the different shades of gray.

An example of an RGB image converted to its grayscale version is shown below. Notice that the shape of the histogram remains the same for the RGB and grayscale images.
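The conversion itself is just a weighted sum of the three channels. A minimal NumPy sketch using the common ITU-R BT.601 luminance weights (one standard choice among several):

```python
import numpy as np

def rgb_to_gray(rgb):
    """Convert an (H, W, 3) RGB image to 8-bit grayscale using the
    ITU-R BT.601 luminance weights (0.299 R + 0.587 G + 0.114 B)."""
    weights = np.array([0.299, 0.587, 0.114])
    return np.rint(rgb @ weights).astype(np.uint8)

# A 1x2 image: one pure white pixel and one pure red pixel.
img = np.array([[[255, 255, 255], [255, 0, 0]]], dtype=np.uint8)
gray = rgb_to_gray(img)
```

White maps to 255, while pure red maps to roughly 76, reflecting how much each channel contributes to perceived brightness.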


3. RGB Color Image

The images we are used to in the modern world are RGB or colored images, which are 24-bit matrices to computers: three 8-bit channels, so 256³ (about 16.7 million) different colors are possible for each pixel. “RGB” represents the Red, Green, and Blue “channels” of an image.

Up until now, we have dealt with images that have only one channel, where two coordinates suffice to locate any value in the matrix. Now, three equal-sized matrices (called channels), each with values ranging from 0 to 255, are stacked on top of each other, so three coordinates are required to specify the value of a matrix element.

Thus, a pixel in an RGB image is black when its value is (0, 0, 0) and white when it is (255, 255, 255). Intermediate combinations produce the full range of displayable colors. For example, (255, 0, 0) is red (since only the red channel is activated for this pixel). Similarly, (0, 255, 0) is green and (0, 0, 255) is blue.

An example of an RGB image split into its channel components is shown below. Notice that the shapes of the histograms for each of the channels are different.


4. RGBA Image

RGBA images are colored RGB images with an extra channel known as “alpha” that depicts the opacity of the RGB image. Opacity ranges from a value of 0% to 100% and is essentially a “see-through” property.

Opacity in physics depicts the amount of light that passes through an object. For instance, cellophane is transparent (0% opacity), frosted glass is translucent, and wood is opaque (100% opacity). The alpha channel in RGBA images tries to mimic this property. An example of this is shown below.
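Alpha blending can be sketched in a few lines. This toy “over” compositing function assumes the alpha value has already been normalized to the range [0, 1]:

```python
import numpy as np

def composite(fg_rgb, alpha, bg_rgb):
    """'Over' compositing of a foreground pixel onto a background pixel.
    alpha is in [0, 1]: 1.0 = fully opaque, 0.0 = fully transparent."""
    fg = np.asarray(fg_rgb, dtype=float)
    bg = np.asarray(bg_rgb, dtype=float)
    return alpha * fg + (1.0 - alpha) * bg

# 50% translucent red over a white background yields pink.
pixel = composite([255, 0, 0], 0.5, [255, 255, 255])
```

At alpha = 1.0 the background disappears entirely; at alpha = 0.0 the foreground is invisible.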


The fundamental steps in any typical Digital Image Processing pipeline are as follows:

1. Image Acquisition

The image is captured by a camera and digitized (if the camera output is not digitized automatically) using an analogue-to-digital converter for further processing in a computer.

2. Image Enhancement

In this step, the acquired image is manipulated to meet the requirements of the specific task for which it will be used. Such techniques are primarily aimed at highlighting hidden or important details in an image, for example through contrast and brightness adjustment. Image enhancement is highly subjective in nature.

3. Image Restoration

This step deals with improving the appearance of an image and is an objective operation since the degradation of an image can be attributed to a mathematical or probabilistic model. For example, removing noise or blur from images.

4. Color Image Processing

This step aims at handling the processing of colored images (RGB or RGBA images), for example, performing color correction or color modeling in images.

5. Wavelets and Multi-Resolution Processing

Wavelets are the building blocks for representing images at various degrees of resolution. Images are successively subdivided into smaller regions for data compression and for pyramidal representation.

6. Image Compression

Due to storage constraints or the bandwidth needed to transfer images between devices, images often cannot be kept at their original size and need to be compressed. This is also important for displaying images over the internet: on Google, for example, a small thumbnail is a highly compressed version of the original image, and only when you click on it is the image shown at its original resolution. This process saves bandwidth on the servers.

7. Morphological Processing

Image components that are useful in the representation and description of shape need to be extracted for further processing or downstream tasks. Morphological Processing provides the tools (which are essentially mathematical operations) to accomplish this. For example, erosion and dilation operations are used to shrink and grow objects in an image, respectively.
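Erosion and dilation reduce to taking the minimum or maximum over a sliding neighborhood. A deliberately naive NumPy sketch for binary images, assuming a k x k square structuring element:

```python
import numpy as np

def erode(img, k=3):
    """Binary erosion with a k x k square structuring element:
    a pixel stays 1 only if its entire neighborhood is 1."""
    H, W = img.shape
    p = k // 2
    padded = np.pad(img, p, constant_values=0)
    out = np.zeros_like(img)
    for i in range(H):
        for j in range(W):
            out[i, j] = padded[i:i + k, j:j + k].min()
    return out

def dilate(img, k=3):
    """Binary dilation: a pixel becomes 1 if any neighbor is 1."""
    H, W = img.shape
    p = k // 2
    padded = np.pad(img, p, constant_values=0)
    out = np.zeros_like(img)
    for i in range(H):
        for j in range(W):
            out[i, j] = padded[i:i + k, j:j + k].max()
    return out

square = np.zeros((7, 7), dtype=np.uint8)
square[2:5, 2:5] = 1          # a 3x3 white square on a black background
```

Eroding the 3x3 square leaves a single pixel; dilating grows it to 5x5. Applying erosion then dilation (an “opening”) restores the original square while removing anything smaller than the structuring element.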

8. Image Segmentation

This step involves partitioning an image into key parts to simplify and/or change its representation into something more meaningful and easier to analyze. Image segmentation allows computers to focus on the most important parts of an image, discarding the rest, which improves the performance of automated systems.

9. Representation and Description

Image segmentation procedures are generally followed by this step, where the task of representation is to decide whether the segmented region should be depicted as a boundary or a complete region. Description deals with extracting attributes that yield quantitative information of interest or are fundamental for differentiating one class of objects from another.

10. Object Detection and Recognition

After the objects are segmented from an image and the representation and description phases are complete, the automated system needs to assign a label to the object—to let the human users know what object has been detected, for example, “vehicle” or “person”, etc.

11. Knowledge Base

Knowledge may be as simple as the bounding box coordinates for an object of interest that has been found in the image, along with the object label assigned to it. Anything that will help in solving the problem for the specific task at hand can be encoded into the knowledge base.


Image processing can be used to improve the quality of an image, remove undesired objects from an image, or even create new images from scratch. For example, image processing can be used to remove the background from an image of a person, leaving only the subject in the foreground.

Image processing is a vast and complex field, with many different algorithms and techniques that can be used to achieve different results. In this section, we will focus on some of the most common image processing tasks and how they are performed.

Task 1: Image Enhancement

One of the most common image processing tasks is image enhancement: improving the quality of an image. It has crucial applications in Computer Vision tasks, remote sensing, and surveillance. One common approach is adjusting the image's contrast and brightness.

Contrast is the difference in brightness between the lightest and darkest areas of an image. Increasing the contrast makes light regions lighter and dark regions darker, so details stand out more. Brightness is the overall lightness or darkness of an image; increasing it makes the whole image lighter. Both contrast and brightness can be adjusted automatically by most image editing software, or they can be adjusted manually.
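Both adjustments are linear point operations of the form out = contrast * in + brightness. A minimal NumPy sketch (the example values are arbitrary):

```python
import numpy as np

def adjust(img, contrast=1.0, brightness=0):
    """Linear point operation: out = contrast * img + brightness,
    clipped back into the valid 8-bit range [0, 255]."""
    out = contrast * img.astype(float) + brightness
    return np.clip(out, 0, 255).astype(np.uint8)

img = np.array([[50, 100, 200]], dtype=np.uint8)
brighter = adjust(img, brightness=40)
punchier = adjust(img, contrast=1.5)   # 200 * 1.5 clips at 255
```

Note the clipping step: without it, stretching contrast would overflow the 8-bit range.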


However, adjusting the contrast and brightness of an image are elementary operations. Sometimes an image with perfect contrast and brightness, when upscaled, becomes blurry due to lower pixel density (pixels per inch). To address this issue, a relatively new and much more advanced concept of Image Super-Resolution is used, wherein a high-resolution image is obtained from its low-resolution counterpart(s). Deep Learning techniques are popularly used to accomplish this.


For example, the earliest use of Deep Learning to address the Super-Resolution problem is the SRCNN model, where a low-resolution image is first upscaled using traditional bicubic interpolation and then fed to a CNN. The network extracts overlapping patches from the input, maps them non-linearly to high-resolution representations, and a final convolution layer reconstructs the high-resolution image. The model framework is depicted visually below.


An example of the results obtained by the SRCNN model compared to its contemporaries is shown below.


Task 2: Image Restoration

Image quality can degrade for several reasons, especially for photos from the era before cloud storage was commonplace. For example, images scanned from hard copies taken with old instant cameras often have scratches on them.


Image Restoration is particularly fascinating because advanced techniques in this area could potentially restore damaged historical documents. Powerful Deep Learning-based image restoration algorithms may be able to reveal large chunks of missing information from torn documents.

Image inpainting, for example, falls under this category: it is the process of filling in the missing pixels in an image. This can be done with a texture synthesis algorithm, which generates plausible texture to fill the missing region. However, Deep Learning-based models are the de facto choice due to their pattern recognition capabilities.
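As a toy stand-in for real texture synthesis, missing pixels can be filled by diffusing values in from the hole's boundary. A deliberately naive NumPy sketch (real inpainting methods are far more sophisticated):

```python
import numpy as np

def inpaint(img, mask, iters=200):
    """Very naive diffusion inpainting: repeatedly replace each missing
    pixel with the average of its 4 neighbors so that known values
    propagate inward from the hole's boundary."""
    out = img.astype(float).copy()
    out[mask] = out[~mask].mean()          # crude initialization
    for _ in range(iters):
        avg = (np.roll(out, 1, 0) + np.roll(out, -1, 0) +
               np.roll(out, 1, 1) + np.roll(out, -1, 1)) / 4.0
        out[mask] = avg[mask]              # update only the missing pixels
    return out

img = np.full((8, 8), 100.0)
img[3:5, 3:5] = 0                          # punch a "hole" in the image
mask = img == 0
restored = inpaint(img, mask)
```

On this flat image the hole is filled back to the surrounding value exactly; on textured images, diffusion produces a smooth but detail-free fill, which is why learned models dominate in practice.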


An example of an image inpainting framework (based on the U-Net autoencoder) was proposed in this paper, which uses a two-step approach to the problem: a coarse estimation step and a refinement step. The main feature of this network is the Coherent Semantic Attention (CSA) layer, which fills the occluded regions in the input images through iterative optimization. The architecture of the proposed model is shown below.


Some example results obtained by the authors and other competing models are shown below.


Task 3: Image Segmentation

Image segmentation is the process of partitioning an image into multiple segments or regions. Each segment represents a different object in the image, and image segmentation is often used as a preprocessing step for object detection.

There are many different algorithms that can be used for image segmentation, but one of the most common approaches is to use thresholding. Binary thresholding, for example, is the process of converting an image into a binary image, where each pixel is either black or white. The threshold value is chosen such that all pixels with a brightness level below the threshold are turned black, and all pixels with a brightness level above the threshold are turned white. This results in the objects in the image being segmented, as they are now represented by distinct black and white regions.
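A fixed-threshold version of this takes a single comparison in NumPy (the threshold value 128 is an arbitrary choice; methods like Otsu's pick it automatically):

```python
import numpy as np

def binarize(img, t=128):
    """Binary thresholding: pixel -> 1 (white) if intensity >= t,
    else 0 (black)."""
    return (img >= t).astype(np.uint8)

img = np.array([[10, 200],
                [130, 90]], dtype=np.uint8)
mask = binarize(img)
```

The output contains only the values 0 and 1, i.e., it is a binary image of the kind described earlier.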


In multi-level thresholding, as the name suggests, different parts of an image are converted to different shades of gray depending on the number of levels. This paper, for example, used multi-level thresholding for medical imaging, specifically for brain MRI segmentation, an example of which is shown below.
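A multi-level variant can be sketched with np.digitize, mapping each pixel to one of len(thresholds) + 1 evenly spaced gray levels (the threshold values below are illustrative):

```python
import numpy as np

def multi_threshold(img, thresholds):
    """Map each pixel to a gray level according to which threshold
    interval it falls into (len(thresholds) + 1 output levels)."""
    levels = np.linspace(0, 255, len(thresholds) + 1).astype(np.uint8)
    idx = np.digitize(img, thresholds)   # interval index per pixel
    return levels[idx]

img = np.array([[10, 90, 170, 250]], dtype=np.uint8)
out = multi_threshold(img, [64, 128, 192])   # 4 output levels
```

With three thresholds the output uses four gray levels (0, 85, 170, 255); binary thresholding is the special case of a single threshold.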


Modern techniques use automated image segmentation algorithms using deep learning for both binary and multi-label segmentation problems. For example, the PFNet or Positioning and Focus Network is a CNN-based model that addresses the camouflaged object segmentation problem. It consists of two key modules—the positioning module (PM) designed for object detection (that mimics predators that try to identify a coarse position of the prey); and the focus module (FM) designed to perform the identification process in predation for refining the initial segmentation results by focusing on the ambiguous regions. The architecture of the PFNet model is shown below.


The PFNet model outperformed contemporary state-of-the-art models; example results are shown below.


Task 4: Object Detection

Object Detection is the task of identifying objects in an image and is often used in applications such as security and surveillance. Many different algorithms can be used for object detection, but the most common approach is to use Deep Learning models, specifically Convolutional Neural Networks (CNNs).


CNNs are a type of Artificial Neural Network that were specifically designed for image processing tasks since the convolution operation in their core helps the computer “see” patches of an image at once instead of having to deal with one pixel at a time. CNNs trained for object detection will output a bounding box (as shown in the illustration above) depicting the location where the object is detected in the image along with its class label.
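The convolution at the core of a CNN can be sketched directly. This minimal valid-mode cross-correlation (the operation deep learning frameworks actually compute under the name "convolution") shows a tiny edge-detecting kernel firing only where the intensity changes:

```python
import numpy as np

def conv2d(img, kernel):
    """Valid-mode 2D cross-correlation: slide the kernel over the image
    and take the elementwise product-sum at each position."""
    H, W = img.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

edge_kernel = np.array([[-1.0, 1.0]])     # horizontal gradient kernel
img = np.array([[0.0, 0.0, 1.0, 1.0]])    # a single vertical edge
response = conv2d(img, edge_kernel)       # nonzero only at the edge
```

This is why CNNs “see” patches rather than single pixels: each output value summarizes a whole neighborhood of the input.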

An example of such a network is the popular Faster R-CNN (Region-based Convolutional Neural Network) model, which is an end-to-end trainable, fully convolutional network. The Faster R-CNN model alternates between fine-tuning for the region proposal task (predicting regions in the image where an object might be present) and then fine-tuning for object detection (detecting what object is present) while keeping the proposals fixed. The architecture and some examples of region proposals are shown below.


Task 5: Image Compression

Image compression is the process of reducing the file size of an image while still trying to preserve the quality of the image. This is done to save storage space, especially to run Image Processing algorithms on mobile and edge devices, or to reduce the bandwidth required to transmit the image.

Traditional approaches use lossy compression algorithms, which work by reducing the quality of the image slightly in order to achieve a smaller file size. The JPEG file format, for example, uses the Discrete Cosine Transform for image compression.
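The idea behind DCT-based compression can be sketched end to end: transform a block, discard the high-frequency coefficients, and transform back. This is a toy version of what JPEG does, without its quantization tables or entropy coding:

```python
import numpy as np

def dct_matrix(N):
    """Orthonormal DCT-II basis matrix (the transform behind JPEG)."""
    n = np.arange(N)
    C = np.sqrt(2.0 / N) * np.cos(
        np.pi * (2 * n[None, :] + 1) * n[:, None] / (2 * N))
    C[0, :] = np.sqrt(1.0 / N)   # DC row gets the smaller scale factor
    return C

def compress(block, keep=4):
    """Toy lossy compression of a square block: transform, zero out all
    but the top-left (low-frequency) keep x keep coefficients, invert."""
    C = dct_matrix(block.shape[0])
    coeffs = C @ block @ C.T          # 2D DCT
    mask = np.zeros_like(coeffs)
    mask[:keep, :keep] = 1
    return C.T @ (coeffs * mask) @ C  # inverse 2D DCT

block = np.outer(np.linspace(0, 255, 8), np.ones(8))  # smooth gradient
out = compress(block, keep=4)
```

Because the block is smooth, most of its energy sits in the low-frequency coefficients, so discarding three quarters of them changes the pixels only slightly; keeping all coefficients reconstructs the block exactly.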

Modern approaches to image compression involve the use of Deep Learning for encoding images into a lower-dimensional feature space and then recovering them on the receiver's side using a decoding network. Such models are called autoencoders, which consist of an encoding branch that learns an efficient encoding scheme and a decoding branch that tries to reconstruct the image from the encoded features with as little loss as possible.


For example, this paper proposed a variable rate image compression framework using a conditional autoencoder. The conditional autoencoder is conditioned on the Lagrange multiplier, i.e., the network takes the Lagrange multiplier as input and produces a latent representation whose rate depends on the input value. The authors also train the network with mixed quantization bin sizes for fine-tuning the rate of compression. Their framework is depicted below.


The authors obtained superior results compared to popular methods like JPEG, both in bits per pixel and in reconstruction quality. An example of this is shown below.


Task 6: Image Manipulation

Image manipulation is the process of altering an image to change its appearance. This may be desired for several reasons, such as removing an unwanted object from an image or adding an object that is not present in the image. Graphic designers often do this to create posters, films, etc.

An example of Image Manipulation is Neural Style Transfer, a technique that uses Deep Learning models to render one image in the style of another. For example, a regular photograph could be rendered in the style of van Gogh's “Starry Night”. Neural Style Transfer also enables AI to generate art.


An example of such a model is the one proposed in this paper, which can transfer arbitrary new styles in real time (other approaches often have much longer inference times) using an autoencoder-based framework. The authors proposed an adaptive instance normalization (AdaIN) layer that adjusts the mean and variance of the content input (the image to be changed) to match those of the style input (the image whose style is to be adopted). The AdaIN output is then decoded back to image space to obtain the final style-transferred image. An overview of the framework is shown below.
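The AdaIN operation itself is only a few lines of statistics. A NumPy sketch on random feature maps (the shapes and values are illustrative; in the real model these would be encoder activations):

```python
import numpy as np

def adain(content, style, eps=1e-5):
    """Adaptive instance normalization: shift and scale the content
    feature map so its per-channel mean and std match the style's."""
    c_mean = content.mean(axis=(1, 2), keepdims=True)
    c_std = content.std(axis=(1, 2), keepdims=True) + eps
    s_mean = style.mean(axis=(1, 2), keepdims=True)
    s_std = style.std(axis=(1, 2), keepdims=True)
    return s_std * (content - c_mean) / c_std + s_mean

rng = np.random.default_rng(0)
content = rng.normal(0.0, 1.0, size=(2, 4, 4))   # (channels, H, W)
style = rng.normal(5.0, 3.0, size=(2, 4, 4))
out = adain(content, style)
```

After the operation, each channel of the output carries the content's spatial structure but the style's first- and second-order statistics, which is exactly what the decoder then turns back into an image.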


Examples of images transferred to other artistic styles are shown below and compared to existing state-of-the-art methods.


Task 7: Image Generation

Synthesis of new images is another important task in image processing, especially for Deep Learning algorithms, which require large quantities of labeled data to train. Image generation methods typically use Generative Adversarial Networks (GANs), a distinctive neural network architecture.


GANs consist of two separate models: the generator, which generates the synthetic images, and the discriminator, which tries to distinguish synthetic images from real images. The generator tries to synthesize images that look realistic to fool the discriminator, and the discriminator trains to better critique whether an image is synthetic or real. This adversarial game allows the generator to produce photo-realistic images after several iterations, which can then be used to train other Deep Learning models.

Task 8: Image-to-Image Translation

Image-to-Image translation is a class of vision and graphics problems where the goal is to learn the mapping between an input image and an output image using a training set of aligned image pairs. For example, a free-hand sketch can be drawn as an input to get a realistic image of the object depicted in the sketch as the output, as shown below.


Pix2pix is a popular model in this domain that uses a conditional GAN (cGAN) for general-purpose image-to-image translation; that is, several image processing problems, like semantic segmentation, sketch-to-image translation, and image colorization, are all solved by the same network. cGANs involve conditional image generation by the generator model. For example, image generation can be conditioned on a class label to generate images specific to that class.


Pix2pix consists of a U-Net generator network and a PatchGAN discriminator network, which takes in NxN patches of an image to predict whether each is real or fake, unlike traditional GAN discriminators, which classify the entire image. The authors argue that such a discriminator enforces constraints that encourage sharp, high-frequency detail. Examples of results obtained by the pix2pix model on image-to-map and map-to-image tasks are shown below.


Key Takeaways

The information technology era we live in has made visual data widely available. However, a lot of processing is required before this data can be transferred over the internet or used for purposes like information extraction, predictive modeling, etc.

The advancement of deep learning technology gave rise to CNN models, which were specifically designed for processing images. Since then, several advanced models have been developed that cater to specific tasks in the Image Processing niche. We looked at some of the most critical techniques in Image Processing and popular Deep Learning-based methods that address these problems, from image compression and enhancement to image synthesis.

Recent research is focused on reducing the need for ground truth labels for complex tasks like object detection, semantic segmentation, etc., by employing concepts like Semi-Supervised Learning and Self-Supervised Learning , which makes models more suitable for broad practical applications.

If you’re interested in learning more about computer vision, deep learning, and neural networks, have a look at these articles:

  • Deep Learning 101: Introduction [Pros, Cons & Uses]
  • What Is Computer Vision? [Basic Tasks & Techniques]
  • Convolutional Neural Networks: Architectures, Types & Examples


Rohit Kundu is a Ph.D. student in the Electrical and Computer Engineering department of the University of California, Riverside. He is a researcher in the Vision-Language domain of AI and has published several papers in top-tier conferences and notable peer-reviewed journals.





Current Trends in Image Processing and Pattern Recognition


About this Research Topic

Special thanks to Dr. Mahendra Dashrath Shirsat, who was instrumental in the organization of this Research Topic. Recent technological advancements in computer science have opened diversified opportunities to researchers working in ...

Keywords: Computer Vision, Pattern Recognition, Artificial Intelligence

Important Note: All contributions to this Research Topic must be within the scope of the section and journal to which they are submitted, as defined in their mission statements. Frontiers reserves the right to guide an out-of-scope manuscript to a more suitable section or journal at any stage of peer review.






Viewpoints on Medical Image Processing: From Science to Application

Thomas M. Deserno (né Lehmann)

1 Department of Medical Informatics, Uniklinik RWTH Aachen, Germany;

Heinz Handels

2 Institute of Medical Informatics, University of Lübeck, Germany;

Klaus H. Maier-Hein (né Fritzsche)

3 Medical and Biological Informatics, German Cancer Research Center, Heidelberg, Germany;

Sven Mersmann

4 Medical and Biological Informatics, Junior Group Computer-assisted Interventions, German Cancer Research Center, Heidelberg, Germany;

Christoph Palm

5 Regensburg – Medical Image Computing (Re-MIC), Faculty of Computer Science and Mathematics, Regensburg University of Applied Sciences, Regensburg, Germany;

Thomas Tolxdorff

6 Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, Germany;

Gudrun Wagenknecht

7 Electronic Systems (ZEA-2), Central Institute of Engineering, Electronics and Analytics, Forschungszentrum Jülich GmbH, Germany;

Thomas Wittenberg

8 Image Processing & Biomedical Engineering Department, Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany

Medical image processing provides core innovations for medical imaging. This paper focuses on recent developments from science to application, analyzing the past fifteen years of the proceedings of the German annual meeting on medical image processing (BVM). Furthermore, some members of the program committee present their personal points of view: (i) multi-modality for imaging and diagnosis, (ii) analysis of diffusion-weighted imaging, (iii) model-based image analysis, (iv) registration of section images, (v) from images to information in digital endoscopy, and (vi) virtual reality and robotics. Medical imaging and medical image computing are seen as fields of rapid development with clear trends toward integrated applications in diagnostics, treatment planning, and treatment.

1.  INTRODUCTION

Current advances in medical imaging are made in fields such as instrumentation, diagnostics, and therapeutic applications and most of them are based on imaging technology and image processing. In fact, medical image processing has been established as a core field of innovation in modern health care [ 1 ] combining medical informatics, neuro-informatics and bioinformatics [ 2 ].

In 1984, the Society of Photo-Optical Instrumentation Engineers (SPIE) launched a multi-track conference on medical imaging, which is still considered the core event for innovation in the field. Analogously, in Germany, the workshop “Bildverarbeitung für die Medizin (BVM)” (Image Processing for Medicine) recently celebrated its 20th edition. Over the years, the meeting has evolved into a multi-track conference of international standard [ 3 , 4 , 5 , 6 , 7 , 8 , 9 ].

Nonetheless, it is hard to name the most important and innovative trends within this broad field, ranging from image acquisition using novel imaging modalities to information extraction in diagnostics and treatment. Ritter et al. recently emphasized the following aspects: (i) enhancement, (ii) segmentation, (iii) registration, (iv) quantification, (v) visualization, and (vi) computer-aided detection (CAD) [ 10 ].

Another concept of structuring is here referred to as the “from-to” approach. For instance,

  • From nano to macro : In 2002, the Institute of Electrical and Electronics Engineers (IEEE) launched an international symposium on biomedical imaging (ISBI), co-founded by Michael Unser of EPFL, Switzerland. True to its motto “from nano to macro”, this conference covers all aspects of medical imaging from the sub-cellular to the organ level.
  • From production to sharing : Another “from-to” migration is seen in the shift from acquisition to communication [ 11 ]. Clark et al. expected advances in the medical imaging fields along the following four axes: (i) image production and new modalities; (ii) image processing, visualization, and system simulation; (iii) image management and retrieval; and (iv) image communication and telemedicine.
  • From kilobyte to terabyte : Deserno et al. identified another “from-to” migration, which is seen in the amount of data produced by medical imagery [ 12 ]. Today, high-resolution CT reconstructs images with 8000 x 8000 pixels per slice with 0.7 μm isotropic detail detectability, and whole body scans with this resolution reach several Gigabytes (GB) of data load. Also, microscopic whole-slide scanning systems can easily provide so-called virtual slices in the range of 30,000 x 50,000 pixels, which equals 16.8 GB at 10-bit gray scale.
  • From science to application : Finally, in this paper, we aim at analyzing recent advances in medical imaging on another level. The focus is to identify core fields fostering the transfer of algorithms into clinical use and to address gaps still remaining to be bridged in future research.

The remainder of this review is organized as follows. In Section 2, we briefly analyze the history of the German workshop BVM: more than 15 years of proceedings are currently available, and statistical analysis is applied to identify trends in the content of conference papers. Section 3 then provides personal viewpoints on challenging and pioneering fields, and the results are discussed thereafter.

2.  THE GERMAN HISTORY FROM SCIENCE TO APPLICATION

Since 1994, annual proceedings of the contributions presented at the BVM workshops have been published; from 1996 onward, they are available electronically in PostScript (PS) or Portable Document Format (PDF). Regardless of the type of presentation (oral, poster, or software demonstration), authors may submit papers of up to five pages; in 2012, the limit was increased to six pages. Both English and German papers are allowed. The number of English contributions has increased steadily over the years, reaching about 50% in 2008 [ 8 ].

In order to analyze the content of the proceedings (on average 124k words long) with respect to the most relevant topics discussed at the BVM workshops, the incidence of the most frequent words was assessed for each proceedings volume from 1996 until 2012. About 300 common words of the German and English languages (e.g., and / und, etc.) were excluded from this investigation. (Fig. 1) presents a word cloud computed from the 100 most frequent terms used in the proceedings of the 2012 BVM workshop. The font sizes of the words reflect their counted frequency in the text.
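The counting procedure described above can be sketched with a Counter. The stopword list here is a tiny illustrative excerpt, not the ~300-word list used in the analysis:

```python
from collections import Counter
import re

# Illustrative excerpt of a German/English stopword list.
STOPWORDS = {"and", "und", "the", "der", "die", "das", "of", "in"}

def top_terms(text, n=5):
    """Count word frequencies, excluding common German/English words,
    in the spirit of the BVM proceedings analysis."""
    words = re.findall(r"[a-zä-ü]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return counts.most_common(n)

sample = "Image segmentation and image registration of the image data"
terms = top_terms(sample, n=2)
```

Run over a full proceedings volume, the resulting counts are exactly what a word-cloud renderer needs: term sizes proportional to frequency.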

Fig. (1). Word cloud representing the most frequent 100 terms counted from the 469-page BVM proceedings 2012 [13].

It can be seen that in 2012, “image” was the most frequent word in the BVM proceedings (920 incidences), as in all other years (1996-2012: 10,123 incidences). Together with terms like “reconstruction”, “analysis”, and “processing”, medical imaging is clearly recognizable as the major subject of the BVM workshops.

Concerning the scientific direction of the BVM meeting over time, terms such as “segmentation”, “registration”, and “navigation”, which indicate image processing procedures relevant for clinical applications, have been used with increasing frequencies (Fig. 2, left). The same holds for terms like “evaluation” or “experiment”, which are related to the validation of the contributions (Fig. 2, middle), constituting a first step towards the transition of the scientific results into a clinical application. (Fig. 2, right) shows the occurrence of the words “patient” and “application” in the contributed papers of the BVM workshops between 1996 and 2012. Here, rather constant numbers of occurrences are found, indicating a stringent focus on clinical applications.

Fig. (2). Trends from BVM workshop proceedings for important terms of processing procedures (left), experimental verification (middle), and application to humans (right).

3.  VIEWPOINTS FROM SCIENCE TO APPLICATION

3.1. Multi-Modal Image Processing for Imaging and Diagnosis

Multi-modal imaging refers to (i) different measurements at a single tomographic system (e.g., MRI and functional MRI), (ii) measurements at different tomographic systems (e.g., computed tomography (CT), positron emission tomography (PET), and single photon emission computed tomography (SPECT)), and (iii) measurements at integrated tomographic systems (PET/CT, PET/MR). Hence, multi-modal tomography has become increasingly popular in clinical and preclinical applications (Fig. 3), providing images of morphology and function (Fig. 4).

Fig. (3). PubMed-cited papers for the search “multimodal AND (imaging OR tomography OR image)”.

Fig. (4). Morphological and functional imaging in clinical and pre-clinical applications.

Multi-modal image processing for enhancing multi-modal imaging procedures primarily deals with image reconstruction and artifact reduction. Examples are the integration of additional information about tissue types from MRI as an anatomical prior to the iterative reconstruction of PET images [ 14 ] and the CT- or MR-based correction of attenuation artifacts in PET, respectively, which is an essential prerequisite for quantitative PET analysis [ 15 , 16 ]. Since these algorithms are part of the imaging workflow, only highly automated, fast, and robust algorithms providing adequate accuracy are appropriate solutions. Accordingly, the whole image in the different modalities must be considered.

This requirement differs for multi-modal diagnostic approaches. In most applications, a single organ or parts of an organ are of interest. Anatomical and particularly pathological regions often show a high variability due to structure, deformation, or movement, which is difficult to predict and is thus a great challenge for image processing. In multi-modality applications, the images represent complementary information, often obtained at different time scales, introducing additional complexity for algorithms. Further differences are introduced by the different resolutions and fields of view, showing the organ of interest in different degrees of completeness. From a scientific and thus algorithmic point of view, image processing methods for multi-modal images must meet higher requirements than those applied to single-modality images.

Taking segmentation, one of the most complex and demanding problems in medical image processing, as an example: the modality showing anatomical and pathological structures at high resolution and contrast (e.g., MRI, CT) is typically used to segment the structure or volume of interest (VOI), in order to subsequently analyze other properties such as function within these target structures. Here, the different resolutions must be taken into account to correct for partial volume effects in the functional modality (e.g., PET, SPECT). Since the structures to be analyzed depend on the disease of the patient examined, automatic segmentation approaches are appropriate solutions if the anatomical structures of interest are known beforehand [ 17 ], while semi-automatic approaches are advantageous if flexibility is needed [ 18 , 19 ].
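
The resolution handling described above can be illustrated with a small sketch: a high-resolution anatomical mask is block-averaged to the functional grid, and the compartment uptakes are then recovered from the fractional volumes by least squares, in the spirit of geometric-transfer-matrix partial volume correction. All image sizes, resolution factors, and uptake values below are synthetic assumptions:

```python
# Sketch: partial-volume-aware analysis of a low-resolution functional image
# using a high-resolution anatomical VOI segmentation (toy geometry).
import numpy as np

def block_average(img, factor):
    """Downsample by averaging factor x factor blocks (toy resolution model)."""
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

factor = 4
hires_mask = np.zeros((64, 64))      # high-resolution VOI mask (e.g. from MRI)
hires_mask[18:46, 18:46] = 1.0       # deliberately misaligned with the coarse grid

# Fractional VOI content per low-resolution (e.g. PET) voxel
frac = block_average(hires_mask, factor)

# Toy functional image mixed by partial volume:
# true VOI uptake 5.0, true background uptake 1.0
pet = 5.0 * frac + 1.0 * (1.0 - frac)

# Recover the compartment means by least squares on the fractional volumes
A = np.stack([frac.ravel(), 1.0 - frac.ravel()], axis=1)
u_voi, u_bg = np.linalg.lstsq(A, pet.ravel(), rcond=None)[0]
```

A naive mean over all voxels touching the VOI would underestimate the true uptake at the boundary; the fractional-volume regression recovers the compartment means despite the resolution gap.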

Transferring research into diagnostic application software requires a graphical user interface (GUI) to parameterize the algorithms, 2D and 3D visualization of multi-modal images and segmentation results, and tools to interact with the visualized images during the segmentation procedure. The Medical Imaging Interaction Toolkit (MITK) [ 20 ] or MeVisLab [ 21 ] provide developers with frameworks for multi-modal visualization and interaction as well as tools to build appropriate GUIs, yielding an interface for integrating new algorithms on their way from science to application.

Another important aspect of transferring algorithms from pure academics to clinical practice is evaluation. Phantoms can be used to evaluate specific properties of an algorithm, but not the real situation with all its uncertainties and variability. Thus, the most important step of this migration is extensive testing of algorithms on large amounts of real clinical data, which is a great challenge, particularly for multi-modal approaches, and should in the future be better supported by publicly available databases.

3.2. Analysis of Diffusion Weighted Images

Due to its sensitivity to micro-structural changes in white matter, diffusion weighted imaging (DWI) is of particular interest to brain research. Stroke is the most common and well-known clinical application of DWI, where the images allow the non-invasive detection of ischemia within minutes of onset and are sensitive and relatively specific in detecting changes triggered by strokes [ 22 ]. The technique has also allowed deeper insights into the pathogenesis of Alzheimer’s disease, Parkinson’s disease, autism spectrum disorder, schizophrenia, and many other psychiatric and non-psychiatric brain diseases. DWI is also applied in the imaging of (mild) traumatic brain injury, where conventional techniques lack the sensitivity to detect the subtle changes occurring in the brain. Here, studies on sports-related traumata in the younger population have raised considerable debate in the recent past [ 23 ].

Methodologically, recent advances in the generation and analysis of large-scale networks on the basis of DWI are particularly exciting and promise new dimensions in quantitative neuro-imaging by applying the profound set of tools available in graph theory to brain image analysis [ 24 ]. DWI sheds light on the living brain's network architecture, revealing the organization of fiber connections together with their development and their change in disease.

Big challenges remain to be solved, though. Despite many years of methodological development in DWI post-processing, the field still seems to be in its infancy. The reliable tractography-based reconstruction of known or pathological anatomy is still an unsolved problem. Reconstruction challenges at the 2011 and 2012 annual meetings of the Medical Image Computing and Computer Assisted Intervention (MICCAI) Society have demonstrated the lack of methods that can reliably reconstruct large and well-known structures like the cortico-spinal tract in datasets of clinical quality [ 25 ]. Missing reference-based evaluation techniques hinder a well-founded demonstration of the real advantages of novel tractography algorithms over previous methods [ 26 ]. These limitations have so far prevented a broader application of DWI tractography, e.g. in surgical guidance. Even though the application of DWI, e.g. in surgical resection, has been shown to facilitate the identification of risk structures [ 27 ], the widespread use of these techniques in surgical practice remains limited, mainly by the lack of robust and standardized methods that can be applied multi-centrically across institutions and by the missing comprehensive evaluation of these algorithms.

There are, however, numerous applications of DWI in cancer imaging which bridge imaging science and clinical application. The imaging modality has shown potential in the detection, staging, and characterization of tumors (Fig. 5), in the evaluation of therapy response, and even in the prediction of therapy outcome [ 28 ]. DWI has also been applied in the detection and characterization of lesions in the abdomen and the pelvis, where the increased cellularity of malignant tissue leads to restricted diffusion compared to the surrounding tissue [ 29 ]. The challenge here, again, will be the establishment of reliable sequences and post-processing methods for the widespread, multi-center application of these techniques in the future.

Fig. 5. Depiction of fiber tracts in the vicinity of a grade IV glioblastoma. The volumetric tracking result (yellow) is overlaid on an axial T2-FLAIR image. Red and green arrows indicate the necrotic tumor core and the peritumoral hyperintensity, respectively. In the frontal parts, fiber tracts are still depicted, whereas in the dorsal part, tracts seem to be either displaced or destroyed by the tumor.

3.3. Model-Based Image Analysis

As already emphasized in the previous viewpoints, there is a big gap between the state of the art in current research and the methods available in clinical application, especially in the field of medical image analysis [ 30 ]. Segmentation of relevant image structures (tissues, tumors, vessels, etc.) is still one of the key problems in medical image computing that lacks robust and automatic methods. The application of purely data-driven approaches like thresholding, region growing, and edge detection, or of enhanced data-driven methods like watershed algorithms, Markov random field (MRF)-based approaches, and graph cuts, often leads to weak segmentations due to low contrast between neighboring image objects, image artifacts, noise, partial volume effects, etc.

Model-based segmentation integrates a-priori knowledge of the shape and appearance of relevant structures into the segmentation process. For example, the local shape of a vessel can be characterized by the vesselness operator [ 31 ], which generates images with an enhanced representation of vessels. Using the vesselness information in combination with the original gray-value image, the segmentation of vessels can be improved significantly, and especially the segmentation of small vessels becomes possible (e.g. [ 32 ]).
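
A minimal single-scale 2D sketch of such a vesselness measure, in the spirit of Frangi's filter, is given below; the parameter values, the bright-vessel assumption, and the single-scale simplification are illustrative and not taken from [ 31 ]:

```python
# Sketch of a 2D vesselness measure from Hessian eigenvalues (single scale,
# tuned for bright tubular structures on a dark background; toy parameters).
import numpy as np
from scipy.ndimage import gaussian_filter

def vesselness2d(img, sigma=2.0, beta=0.5, c=0.2):
    # Scale-normalized second derivatives (entries of the Hessian)
    Hxx = sigma**2 * gaussian_filter(img, sigma, order=(0, 2))
    Hyy = sigma**2 * gaussian_filter(img, sigma, order=(2, 0))
    Hxy = sigma**2 * gaussian_filter(img, sigma, order=(1, 1))
    # Eigenvalues of the symmetric 2x2 Hessian, sorted so that |l1| >= |l2|
    tmp = np.sqrt(((Hxx - Hyy) / 2) ** 2 + Hxy**2)
    mid = (Hxx + Hyy) / 2
    l1, l2 = mid + tmp, mid - tmp
    swap = np.abs(l1) < np.abs(l2)
    l1, l2 = np.where(swap, l2, l1), np.where(swap, l1, l2)
    Rb = np.abs(l2) / (np.abs(l1) + 1e-12)   # "blobness" ratio
    S = np.sqrt(l1**2 + l2**2)               # second-order structure strength
    v = np.exp(-Rb**2 / (2 * beta**2)) * (1 - np.exp(-S**2 / (2 * c**2)))
    v[l1 > 0] = 0.0                          # keep bright-on-dark ridges only
    return v
```

A full multi-scale filter would evaluate this measure over a range of sigmas and take the per-pixel maximum, so that vessels of different calibers respond at their matching scale.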

In statistical or active shape and appearance models [ 33 , 34 ], the shape variability of organs among individuals and the characteristic gray value distributions in the neighborhood of an organ can be represented. In these approaches, a set of segmented image data is used to train active shape and active appearance models, which include information about the mean shape and its variations as well as the characteristic gray value distributions and their variation in the population represented by the training data set. Instead of the direct point-to-point correspondences used during the generation of classical statistical shape models, Hufnagel et al. have suggested probabilistic point-to-point correspondences [ 35 ]. This approach takes into account that inaccuracies are often unavoidable when defining direct point correspondences between the organs of different persons. In probabilistic statistical shape models, these correspondence uncertainties are respected explicitly to improve the robustness and accuracy of shape modeling and model-based segmentation. Integrated into an energy-minimizing level set framework, probabilistic statistical shape models can be used for enhanced organ segmentation [ 36 ].
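
The construction of such a statistical shape model can be sketched as PCA over aligned landmark vectors; the training shapes below are synthetic circles with varying radius, whereas a real model is trained on segmented, corresponded image data:

```python
# Sketch of a point-distribution (active shape) model: PCA over aligned
# landmark sets yields a mean shape plus principal modes of variation.
import numpy as np

rng = np.random.default_rng(0)

# 20 synthetic training shapes, 30 landmarks each, flattened to (x1,y1,...)
angles = np.linspace(0, 2 * np.pi, 30, endpoint=False)
shapes = []
for _ in range(20):
    r = 10 + rng.normal(0, 1)                 # per-subject size variation
    shapes.append(np.column_stack([r * np.cos(angles),
                                   r * np.sin(angles)]).ravel())
X = np.array(shapes)                          # (20, 60)

mean_shape = X.mean(axis=0)
U, s, Vt = np.linalg.svd(X - mean_shape, full_matrices=False)
modes = Vt                                    # rows: principal modes of variation
var = s**2 / (len(X) - 1)                     # variance explained per mode

def synthesize(b):
    """Generate a new shape from mode coefficients b (len(b) <= n modes)."""
    return mean_shape + b @ modes[: len(b)]

new_shape = synthesize(np.array([2.0]))       # move 2 units along the first mode
```

During model-based segmentation, the coefficients b are optimized (and usually constrained to plausible ranges such as a few standard deviations per mode) so that the synthesized shape matches the image evidence.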

In contrast, atlas-based segmentation methods (e.g., [ 37 ]) realize a case-based approach and make use of the segmentation information contained in a single segmented data set, which is transferred to an unseen patient image data set. The transfer of the atlas segmentation to the patient is done by inter-individual non-linear registration methods. Multi-atlas segmentation methods using several atlases have been proposed (e.g. [ 38 ]) and show improved accuracy and robustness in comparison to single-atlas segmentation methods. Hence, multi-atlas approaches are currently a focus of further research [ 39 , 40 ].
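
The fusion step of a multi-atlas pipeline can be sketched as a per-voxel majority vote over the propagated atlas labels; the non-linear registration of each atlas to the patient is assumed to have been performed already, and the label images below are toy data:

```python
# Sketch: majority-vote label fusion for multi-atlas segmentation.
# Inputs are atlas label images already warped into the patient's space.
import numpy as np

def majority_vote(propagated_labels):
    """propagated_labels: (n_atlases, H, W) integer label images."""
    stacked = np.asarray(propagated_labels)
    n_labels = stacked.max() + 1
    # Per-voxel vote count for each label value, then pick the winner
    votes = np.stack([(stacked == l).sum(axis=0) for l in range(n_labels)])
    return votes.argmax(axis=0)

# Three toy "atlases": two agree on a square organ, one is shifted
a1 = np.zeros((8, 8), int); a1[2:6, 2:6] = 1
a2 = np.zeros((8, 8), int); a2[2:6, 2:6] = 1
a3 = np.zeros((8, 8), int); a3[3:7, 3:7] = 1

fused = majority_vote([a1, a2, a3])
```

More elaborate fusion schemes weight each atlas by its local similarity to the patient image instead of counting all votes equally, which is one reason multi-atlas methods outperform a single atlas.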

In the future, more task-oriented systems integrated into diagnostic processes, intervention planning, therapy, and follow-up are needed. In the field of image analysis, due to the limited time of physicians, automatic procedures are of special interest to segment and extract quantitative object parameters in an accurate, reproducible, and robust way. Furthermore, intelligent and easy-to-use methods for the fast correction of unavoidable segmentation errors are needed.

3.4. Registration of Section Images

Imaging techniques such as histology [ 41 ] or autoradiography [ 42 ] are based on thin post-mortem sections. In comparison to in-vivo imaging, e.g. positron emission tomography (PET), magnetic resonance imaging (MRI), or DWI (as addressed in the previous viewpoint, cf. Section 3.2), several properties are considered advantageous. For instance, tissue can be processed after sectioning to enhance contrast (e.g. staining) [ 43 ], to mark specific properties like receptors [ 44 ], or to study the spatial element distribution by laser ablation [ 45 ]; tissue can be scanned at high resolution [ 43 ]; and tissue is thin enough to allow optical light transmission imaging, e.g. polarized light imaging (PLI) [ 46 ]. Therefore, section imaging yields highly space-resolved, high-contrast data, which supports findings such as cytoarchitectonic boundaries [ 47 ], neuronal fiber directions [ 48 ], and receptor or element distributions [ 45 ].

Restacking 2D sections into a 3D volume, followed by the fusion of this stack with an in-vivo volume, is the challenging task of medical image processing on the track from science to application. The 3D section stacks then serve as an atlas for a large variety of applications. Sections are non-linearly deformed during cutting and post-processing. Additionally, discontinuous artifacts like tears or enrolled tissue hamper the correspondence between the true structure and the imaged tissue.

The so-called “problem of the digitized banana” [ 41 ] prohibits section-by-section registration without a 3D reference. Smoothness of registered stacks is not equivalent to consistency and correctness. Whereas the deformations are section-specific, the orientation of the sections relative to the 3D structure depends on the cutting direction and is thus the same for all sections. In this tangled situation, the question arises whether it is better (i) to restack the sections first, register the whole stack afterwards, and correct for deformations last (volume-first approach), or (ii) to register each section individually to the 3D reference volume while correcting deformations at the same time (section-first approach). Both approaches combine:

  • Multi-modal registration : The need for a 3D reference and the aim to correlate high-resolution section imaging findings with in-vivo imaging are sometimes addressed at the same time. If possible, the 3D in-vivo modality itself is used as the reference.

Fig. 6. Characteristic flow chart of the volume-first approach and volume generation with (gray boxes) or without blockface images as an intermediate reference modality (Column I). Either the in-vivo volume is post-processed to generate a pseudo-high-resolution volume with propagated section gaps (Column II), or the section volume is post-processed to get a low-resolution stack with filled gaps (Column III) [42].

Due to the variety of difficulties, missing evaluation possibilities, and section specifics such as post-processing, embedding, cutting procedure, and tissue type, there is not just one best approach to get from 2D to 3D. But careful work in this field pays off in cutting-edge applications. Not least within the European flagship Human Brain Project (HBP), further research in this area of medical image processing is in demand. The state-of-the-art review of the HBP states in the context of human brain mapping: “What is missing to date is an integrated open source tool providing a standard application programming interface (API) for data registration and coordinate transformations and guaranteeing multi-scale and multi-modal data accuracy” [ 49 ]. Such a tool will narrow the gap from science to application.

3.5. From Images to Information in Digital Endoscopy

Basic endoscopic technologies and their routine applications (Fig. 7, bottom layers) are still purely data-oriented, as the complete image analysis and interpretation is performed solely by the physician. If the content of endoscopic imagery is analyzed automatically, several new application scenarios for diagnostics and intervention with increasing complexity can be identified (Fig. 7, upper layers). As these new possibilities of endoscopy are inherently coupled with the use of computers, these new endoscopic methods and applications can be referred to as computer-integrated endoscopy [ 50 ]. Information, in this sense, is reached on the highest of the five levels of semantics (Fig. 7):

Fig. 7. Modules to build computer-integrated endoscopy, which enables information gain from image data.

  • 1. Acquisition : Advancements in diagnostic endoscopy were achieved with glass fibers for the transmission of electric light into, and image information out of, the body. Besides the purely wire-bound transmission of endoscopic imagery, in the past 10 years wireless broadcast has become available for gastroscopic video data captured by capsule endoscopes [ 51 ].
  • 2. Transportation : Based on digital technologies, essential basic processes of endoscopic still image and image sequence capturing, storage, archiving, documentation, annotation and transmission have been simplified. These developments have initially led to the possibilities for tele-diagnosis and tele-consultations in diagnostic endoscopy, where the image data is shared using local networks or the internet [ 52 ].
  • 3. Enhancement : Methods and applications for image enhancement include the intelligent removal of honeycomb patterns in fiberscopic recordings [ 53 ], temporal filtering for the reduction of ablation smoke and moving particles [ 54 ], and image rectification for gastroscopes. Additionally, besides their increased complexity, these methods have to work in real time with a maximum delay of 60 milliseconds to be acceptable to surgeons and physicians.
  • 4. Augmentation : Image processing enhances endoscopic views with additional types of information. Examples are an artificial working horizon, key-hole views extended to endoscopic panorama images [ 55 ], and 3D surfaces computed from point clouds obtained by special endoscopic imaging devices such as stereo endoscopes [ 56 ], time-of-flight endoscopes [ 57 ], or shape-from-polarization approaches [ 58 ]. This level also includes the visualization and image fusion of endoscopic views with preoperatively acquired radiological imagery such as angiography or CT data [ 59 ] for better intra-operative orientation and navigation, as well as image-based tracking and navigation through tubular structures [ 60 ].
  • 5. Content : Methods of content-based image analysis address the automated segmentation, characterization, and classification of diagnostic image content. Such methods cover computer-assisted detection (CADe) [ 61 ] of lesions (such as polyps) or computer-assisted diagnostics (CADx) [ 62 ], where already detected and delineated regions are characterized and classified into, for instance, benign or malignant tissue areas. Furthermore, such methods automatically identify and track surgical instruments, e.g. supporting robotic surgery approaches.

On the technical side, the semantics of the extracted image contents increases from pure image recording up to the image content analysis level. This complexity also relates to the expected time needed to bring these methods from science to clinical application.

From the clinical side, the most complex methods, such as automated polyp detection (CADe), are considered the most important. However, it is expected that computer-integrated endoscopy systems will increasingly enter clinical applications and as such will contribute to the quality of patient healthcare.

3.6. Virtual Reality and Robotics

Virtual reality (VR) and robotics are two rapidly expanding fields with growing application in surgery. VR creates three-dimensional environments with a high capability for sensory immersion, providing the sensation of being present in the virtual space. Applications of VR include surgical planning, case rehearsal, and case playback, which could change the paradigm of surgical training; this is especially necessary as the regulations surrounding residencies continue to change [ 63 ]. Surgeons can practice in controlled situations with preset variables to gain experience in a wide variety of surgical scenarios [ 64 ].

With the availability of inexpensive computational power and the need for cost-effective solutions in healthcare, medical technology products are being commercialized at an increasingly rapid pace. VR is already incorporated into several emerging products for medical education, radiology, surgical planning and procedures, physical rehabilitation, disability solutions, and mental health [ 65 ]. For example, VR is helping surgeons learn invasive techniques before operating, and allowing physicians to conduct real-time remote diagnosis and treatment. Other applications of VR include the modeling of molecular structures in three dimensions as well as aiding in genetic mapping and drug synthesis.

In addition, the contribution of robotics has accelerated the replacement of many open surgical treatments with more efficient minimally invasive surgical techniques using 3D visualization techniques. Robotics provides mechanical assistance with surgical tasks, contributing greater precision and accuracy and allowing automation. Robots contain features that can augment surgical performance, for instance, by steadying a surgeon’s hand or scaling the surgeon’s hand motions [ 66 ]. Current robots work in tandem with human operators to combine the advantages of human thinking with the capabilities of robots to provide data, to optimize localization on a moving subject, to operate in difficult positions, or to perform without muscle fatigue. Surgical robots require spatial orientation between the robotic manipulators and the human operator, which can be provided by VR environments that re-create the surgical space. This enables surgeons to perform with the advantage of mechanical assistance but without being alienated from the sights, sounds, and touch of surgery [ 67 ].

After many years of research and development, Japanese scientists recently presented an autonomous robot that is able to perform surgery within the human body [ 68 ]. A miniature robot is sent into the patient’s body, and the surgeon perceives what the robot sees and touches, conducting the surgery with the robot’s minute arms as though they were the surgeon’s own.

While the possibilities – and the need – for medical VR and robotics are immense, approaches and solutions for new applications require diligent, cooperative efforts among technology developers, medical practitioners, and medical consumers to establish where future requirements and demand will lie. Augmented and virtual reality, substituting or enhancing reality, can be considered multi-reality approaches [ 69 ], which are already available in commercial products for clinical applications.

4.  DISCUSSION

In this paper, we have analyzed the written proceedings of the German annual meeting on Medical Imaging (BVM) and presented personal viewpoints on medical image processing, focusing on the transfer from science to application. Reflecting successful clinical applications and promising technologies that have been developed recently, it turned out that medical image computing has moved from single images to multiple images, and there are several ways to combine these images:

  • Multi-modality : Figs. 2 and 3 have emphasized that medical image processing has moved from the simple 2D radiograph via 3D imaging modalities to multi-modal processing and analysis. Successful applications that are transferable into the clinics jointly process imagery from different modalities.
  • Multi-resolution : Here, images with different properties from the same subject and body area need alignment and comparison. Usually, this implies a multi-resolution approach, since different modalities work at different resolution scales.
  • Multi-scale : If data becomes large, as pointed out for digital pathology, algorithms must operate on different scales, iteratively refining the alignment from coarse to fine. Such an algorithmic design is usually referred to as a multi-scale approach.
  • Multi-subject : Models have been identified as a key issue for implementing applicable image computing. Such models are used for segmentation, content understanding, and intervention planning. They are generated from a reliable set of references, usually based on several subjects.
  • Multi-atlas : Even more complex, the personal viewpoints have identified multi-atlas approaches, which are nowadays addressed in research. In segmentation, for instance, the accuracy and robustness of algorithms are improved if they are based on multiple atlases rather than a single atlas. Both accuracy and robustness are essential requirements for transferring algorithms into clinical use.
  • Multi-semantics : Based on the example of digital endoscopy, another “multi” term has been introduced. Image understanding and interpretation have been defined on several levels of semantics, and successful applications in computer-integrated endoscopy operate on several of these levels.
  • Multi-reality : Finally, our last viewpoint has addressed the augmentation of the physician’s view by means of virtual reality. Medical image computing is applied to generate and superimpose such views, which results in a multi-reality world.
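
The multi-resolution and multi-scale ideas above can be sketched as a coarse-to-fine alignment; the toy example below estimates a translation on a downsampled level first and refines it at full resolution (translation-only SSD matching is an illustrative simplification of real registration pipelines):

```python
# Sketch of coarse-to-fine (multi-scale) registration: a cheap global search
# on a downsampled level, refined locally at full resolution.
import numpy as np

def downsample(img, f):
    h, w = img.shape
    return img[: h // f * f, : w // f * f].reshape(h // f, f, w // f, f).mean(axis=(1, 3))

def best_shift(fixed, moving, search):
    """Exhaustive integer-shift search minimizing the sum of squared differences."""
    best, best_err = (0, 0), np.inf
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            shifted = np.roll(np.roll(moving, dy, axis=0), dx, axis=1)
            err = ((fixed - shifted) ** 2).sum()
            if err < best_err:
                best, best_err = (dy, dx), err
    return best

def register(fixed, moving, factor=4):
    # Coarse level: global search on downsampled images (cheap)
    cy, cx = best_shift(downsample(fixed, factor), downsample(moving, factor), 5)
    # Fine level: refine around the up-scaled coarse estimate (local)
    coarse = np.roll(np.roll(moving, cy * factor, axis=0), cx * factor, axis=1)
    fy, fx = best_shift(fixed, coarse, factor)
    return cy * factor + fy, cx * factor + fx

fixed = np.zeros((64, 64)); fixed[20:30, 20:30] = 1.0
moving = np.roll(np.roll(fixed, 6, axis=0), -9, axis=1)   # known misalignment
shift = register(fixed, moving)
```

The coarse level shrinks the search space quadratically, which is what makes the multi-scale design tractable on large data; real pipelines apply the same principle to non-rigid transforms.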

Andriole, Barish, and Khorasani have also discussed issues to consider for advanced image processing in the clinical arena [ 70 ]. In completion of the collection of “multi” issues, they emphasized that radiology practices are experiencing a tremendous increase in the number of images associated with each imaging study, due to multi-slice , multi-plane and/or multi-detector 3D imaging equipment. Computer-aided detection, used as a second reader or as a first-pass screener, will help maintain or perhaps improve readers' performance on such big data in terms of sensitivity and specificity.

Last but not least, with all these “multis”, the computational load of algorithms again becomes an issue. Modern computers provide enormous computational power, inviting a revisiting of several “old” approaches that have not yet found their way into clinical use simply because of their processing times. However, when many images of large size are combined, processing time becomes crucial again. Scholl et al. have recently addressed this issue, reviewing applications based on parallel processing and the usage of graphics processors for image analysis [ 12 ]. These can be seen as multi-processing methods.

In summary, medical image processing is a progressive field of research, and more and more applications are becoming part of clinical practice. These applications are based on one or more of the “multi” concepts that we have addressed in this review. However, effects from current trends in the Medical Device Directives, which increase the effort needed for clinical trials of new medical imaging procedures, cannot be observed to date. It will hence be interesting to follow the translation of the scientific results of future BVM workshops into clinical applications.

ACKNOWLEDGEMENTS

We would like to thank Hans-Peter Meinzer, Co-Chair of the German BVM, for his helpful suggestions and for encouraging his research fellows to contribute, hence giving this paper a “multi-generation” view.

CONFLICT OF INTEREST

The author(s) confirm that this article content has no conflict of interest.

Thank you for visiting nature.com. You are using a browser version with limited support for CSS. To obtain the best experience, we recommend you use a more up to date browser (or turn off compatibility mode in Internet Explorer). In the meantime, to ensure continued support, we are displaying the site without styles and JavaScript.

  • View all journals
  • Explore content
  • About the journal
  • Publish with us
  • Sign up for alerts

Image processing articles within Nature Methods

Correspondence | 10 June 2024

Omega — harnessing the power of large language models for bioimage analysis

  • Loïc A. Royer

Correspondence | 17 May 2024

DL4MicEverywhere: deep learning for microscopy made flexible, shareable and reproducible

  • Iván Hidalgo-Cenalmor
  • , Joanna W. Pylvänäinen
  •  &  Estibaliz Gómez-de-Mariscal

Article | 12 April 2024

Pretraining a foundation model for generalizable fluorescence microscopy-based image restoration

A pretrained foundation model (UniFMIR) enables versatile and generalizable performance across diverse fluorescence microscopy image reconstruction tasks.

  • , Weimin Tan
  •  &  Bo Yan

Resource 09 April 2024 | Open Access

Three million images and morphological profiles of cells treated with matched chemical and genetic perturbations

The CPJUMP1 Resource comprises Cell Painting images and profiles of 75 million cells treated with hundreds of chemical and genetic perturbations. The dataset enables exploration of their relationships and lays the foundation for the development of advanced methods to match perturbations.

  • Srinivas Niranj Chandrasekaran
  • , Beth A. Cimini
  •  &  Anne E. Carpenter

Research Briefing | 01 April 2024

Creating a universal cell segmentation algorithm

Cell segmentation currently involves the use of various bespoke algorithms designed for specific cell types, tissues, staining methods and microscopy technologies. We present a universal algorithm that can segment all kinds of microscopy images and cell types across diverse imaging protocols.

Analysis | 26 March 2024

The multimodality cell segmentation challenge: toward universal solutions

Cell segmentation is crucial in many image analysis pipelines. This analysis compares many tools on a multimodal cell segmentation benchmark. A Transformer-based model performed best in terms of performance and general applicability.

  • , Ronald Xie
  •  &  Bo Wang

Editorial | 12 February 2024

Where imaging and metrics meet

When it comes to bioimaging and image analysis, details matter. Papers in this issue offer guidance for improved robustness and reproducibility.

Correspondence | 24 January 2024

EfficientBioAI: making bioimaging AI models efficient in energy and latency

  • , Jiajun Cao
  •  &  Jianxu Chen

Correspondence | 08 January 2024

JDLL: a library to run deep learning models on Java bioimage informatics platforms

  • Carlos García López de Haro
  • , Stéphane Dallongeville
  •  &  Jean-Christophe Olivo-Marin

Article 08 January 2024 | Open Access

Unsupervised and supervised discovery of tissue cellular neighborhoods from cell phenotypes

CytoCommunity enables both supervised and unsupervised analyses of spatial omics data in order to identify complex tissue cellular neighborhoods based on cell phenotypes and spatial distributions.

  • , Jiazhen Rong
  •  &  Kai Tan

Article 04 January 2024 | Open Access

Image restoration of degraded time-lapse microscopy data mediated by near-infrared imaging

InfraRed-mediated Image Restoration (IR 2 ) uses deep learning to combine the benefits of deep-tissue imaging with NIR probes and the convenience of imaging with GFP for improved time-lapse imaging of embryogenesis.

  • Nicola Gritti
  • , Rory M. Power
  •  &  Jan Huisken

Method to Watch | 06 December 2023

Imaging across scales

New twists on established methods and multimodal imaging are poised to bridge gaps between cellular and organismal imaging.

  • Rita Strack

Visual proteomics

Advances will enable proteome-scale structure determination in cells.

Article 06 December 2023 | Open Access

Embryo mechanics cartography: inference of 3D force atlases from fluorescence microscopy

Foambryo is an analysis pipeline for three-dimensional force-inference measurements in developing embryos.

  • Sacha Ichbiah
  • , Fabrice Delbary
  •  &  Hervé Turlier

Article | 06 December 2023

TubULAR: tracking in toto deformations of dynamic tissues via constrained maps

TubULAR is an in toto tissue cartography method for mapping complex dynamic surfaces

  • Noah P. Mitchell
  •  &  Dillon J. Cislo

Research Briefing | 05 December 2023

Inferring how animals deform improves cell tracking

Tracking cells is a time-consuming part of biological image analysis, and traditional manual annotation methods are prohibitively laborious for tracking neurons in the deforming and moving Caenorhabditis elegans brain. By leveraging machine learning to develop a ‘targeted augmentation’ method, we substantially reduced the number of labeled images required for tracking.

Article | 05 December 2023

Automated neuron tracking inside moving and deforming C. elegans using deep learning and targeted augmentation

Targettrack is a deep-learning-based pipeline for automatic tracking of neurons within freely moving C. elegans . Using targeted augmentation, the pipeline has a reduced need for manually annotated training data.

  • Core Francisco Park
  • , Mahsa Barzegar-Keshteli
  •  &  Sahand Jamal Rahi

Brief Communication | 16 November 2023

Improving resolution and resolvability of single-particle cryoEM structures using Gaussian mixture models

This manuscript describes a refinement protocol that extends the e2gmm method to optimize both the orientation and conformation estimation of particles to improve the alignment for flexible domains of proteins.

  • Muyuan Chen
  • , Michael F. Schmid
  •  &  Wah Chiu

Article 13 November 2023 | Open Access

Bio-friendly long-term subcellular dynamic recording by self-supervised image enhancement microscopy

DeepSeMi is a self-supervised denoising framework that can enhance SNR by over 12 dB across diverse samples and imaging modalities. DeepSeMi enables extended longitudinal imaging of subcellular dynamics with high spatiotemporal resolution.

  • Guoxun Zhang
  • , Xiaopeng Li
  •  &  Qionghai Dai
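As a reference point for the 12 dB figure quoted above: an SNR gain in decibels can be computed directly, and 12 dB corresponds to roughly a fourfold reduction in noise amplitude. The sketch below is purely illustrative (a synthetic sine signal, not DeepSeMi's evaluation protocol); the `snr_db` helper and the simulated "denoiser output" are assumptions for demonstration.

```python
import numpy as np

def snr_db(signal, noisy):
    """SNR in dB of a noisy observation against a clean reference."""
    noise = noisy - signal
    return 10 * np.log10(np.sum(signal ** 2) / np.sum(noise ** 2))

rng = np.random.default_rng(1)
clean = np.sin(np.linspace(0, 8 * np.pi, 1000))
noisy = clean + rng.normal(0, 0.5, 1000)
# Stand-in for a denoiser output: the same signal with 4x smaller noise std.
denoised = clean + rng.normal(0, 0.125, 1000)

gain = snr_db(clean, denoised) - snr_db(clean, noisy)
print(f"SNR gain: {gain:.1f} dB")  # 20*log10(4) is about 12 dB
```

The identity behind the comment: halving noise amplitude adds 20·log10(2) ≈ 6 dB, so a 4× amplitude reduction yields about 12 dB.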

High-fidelity 3D live-cell nanoscopy through data-driven enhanced super-resolution radial fluctuation

Enhanced super-resolution radial fluctuations (eSRRF) offers improved image fidelity and resolution compared to the popular SRRF method and further enables volumetric live-cell super-resolution imaging at high speeds.

  • Romain F. Laine
  • , Hannah S. Heil
  •  &  Ricardo Henriques

Article 26 October 2023 | Open Access

nextPYP: a comprehensive and scalable platform for characterizing protein variability in situ using single-particle cryo-electron tomography

nextPYP is a turn-key framework for single-particle cryo-electron tomography that streamlines complex data analysis pipelines, from pre-processing of tilt series to high-resolution refinement, for efficient analysis and visualization of large datasets.

  • Hsuan-Fu Liu
  •  &  Alberto Bartesaghi

Article | 07 September 2023

FIOLA: an accelerated pipeline for fluorescence imaging online analysis

FIOLA is a pipeline for processing calcium or voltage imaging data. Its advantages include fast processing speed and online operation.

  • Changjia Cai
  • , Cynthia Dong
  •  &  Andrea Giovannucci

Correspondence | 18 August 2023

napari-imagej: ImageJ ecosystem access from napari

  • Gabriel J. Selzer
  • , Curtis T. Rueden
  •  &  Kevin W. Eliceiri

Article 17 August 2023 | Open Access

Alignment of spatial genomics data using deep Gaussian processes

Gaussian Process Spatial Alignment (GPSA) aligns multiple spatially resolved genomics and histology datasets and improves downstream analysis.

  • Andrew Jones
  • , F. William Townes
  •  &  Barbara E. Engelhardt

Brief Communication 27 July 2023 | Open Access

Segmentation metric misinterpretations in bioimage analysis

This study shows the importance of proper metrics for comparing algorithms for bioimage segmentation and object detection by exploring the impact of metrics on the relative performance of algorithms in three image analysis competitions.

  • Dominik Hirling
  • , Ervin Tasnadi
  •  &  Peter Horvath
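As background for why metric choice matters, here is a minimal, generic illustration (not code from the study) of two common segmentation metrics, intersection over union (IoU) and Dice, computed on toy binary masks; the mask shapes are invented for demonstration.

```python
import numpy as np

def iou(pred, gt):
    """Intersection over union (Jaccard index) for binary masks."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union else 1.0

def dice(pred, gt):
    """Dice (F1) coefficient for binary masks."""
    inter = np.logical_and(pred, gt).sum()
    total = pred.sum() + gt.sum()
    return 2 * inter / total if total else 1.0

# Two toy predictions against the same 16-pixel ground-truth square.
gt = np.zeros((8, 8), bool); gt[2:6, 2:6] = True          # 16 px object
pred_a = np.zeros((8, 8), bool); pred_a[2:6, 2:5] = True  # 12 px, fully inside
pred_b = np.zeros((8, 8), bool); pred_b[3:7, 3:7] = True  # 16 px, shifted

print(f"A: IoU={iou(pred_a, gt):.3f}  Dice={dice(pred_a, gt):.3f}")
print(f"B: IoU={iou(pred_b, gt):.3f}  Dice={dice(pred_b, gt):.3f}")
```

On a single mask pair the two metrics always rank predictions the same way, but because Dice is systematically higher than IoU, thresholded summaries (e.g. counting a match at score > 0.5) can disagree depending on which metric the threshold is applied to, which is the kind of effect that can distort algorithm comparisons.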

Article | 27 July 2023

DBlink: dynamic localization microscopy in super spatiotemporal resolution via deep learning

DBlink uses deep learning to capture long-term dependencies between different frames in single-molecule localization microscopy data, yielding super spatiotemporal resolution videos of fast dynamic processes in living cells.

  • Onit Alalouf
  •  &  Yoav Shechtman

Editorial | 11 July 2023

What’s next for bioimage analysis?

Advanced bioimage analysis tools are poised to disrupt the way in which microscopy images are acquired and analyzed. This Focus issue shares the hopes and opinions of experts on the near and distant future of image analysis.

Comment | 11 July 2023

The future of bioimage analysis: a dialog between mind and machine

The field of bioimage analysis is poised for a major transformation, owing to advancements in imaging technologies and artificial intelligence. The emergence of multimodal foundation models — which are akin to large language models (such as ChatGPT) but are capable of comprehending and processing biological images — holds great potential for ushering in a revolutionary era in bioimage analysis.

Unveiling the vision: exploring the potential of image analysis in Africa

Here we discuss the prospects of bioimage analysis in the context of the African research landscape as well as challenges faced in the development of bioimage analysis in countries on the continent. We also speculate about potential approaches and areas of focus to overcome these challenges and thus build the communities, infrastructure and initiatives that are required to grow image analysis in African research.

  • Mai Atef Rahmoon
  • , Gizeaddis Lamesgin Simegn
  •  &  Michael A. Reiche

The Twenty Questions of bioimage object analysis

The language used by microscopists who wish to find and measure objects in an image often differs in critical ways from that used by computer scientists who create tools to help them do this, making communication hard across disciplines. This work proposes a set of standardized questions that can guide analyses and shows how it can improve the future of bioimage analysis as a whole by making image analysis workflows and tools more FAIR (findable, accessible, interoperable and reusable).

  • Beth A. Cimini

Smart microscopes of the future

We dream of a future where light microscopes have new capabilities: language-guided image acquisition, automatic image analysis based on extensive prior training from biologist experts, and language-guided image analysis for custom analyses. Most capabilities have reached the proof-of-principle stage, but implementation would be accelerated by efforts to gather appropriate training sets and make user-friendly interfaces.

  • Anne E. Carpenter

Using AI in bioimage analysis to elevate the rate of scientific discovery as a community

The future of bioimage analysis is increasingly defined by the development and use of tools that rely on deep learning and artificial intelligence (AI). For this trend to continue in a way most useful for stimulating scientific progress, it will require our multidisciplinary community to work together, establish FAIR (findable, accessible, interoperable and reusable) data sharing and deliver usable and reproducible analytical tools.

  • Damian Dalle Nogare
  • , Matthew Hartley
  •  &  Florian Jug

Scaling biological discovery at the interface of deep learning and cellular imaging

Concurrent advances in imaging technologies and deep learning have transformed the nature and scale of data that can now be collected with imaging. Here we discuss the progress that has been made and outline potential research directions at the intersection of deep learning and imaging-based measurements of living systems.

  • Morgan Schwartz
  • , Uriah Israel
  •  &  David Van Valen

Towards effective adoption of novel image analysis methods

The bridging of domains such as deep learning-driven image analysis and biology brings exciting promises of previously impossible discoveries as well as perils of misinterpretation and misapplication. We encourage continual communication between method developers and application scientists that emphasizes likely pitfalls and provides validation tools in conjunction with new techniques.

  • Talley Lambert
  •  &  Jennifer Waters

Towards foundation models of biological image segmentation

In the ever-evolving landscape of biological imaging technology, it is crucial to develop foundation models capable of adapting to various imaging modalities and tackling complex segmentation tasks.

When seeing is not believing: application-appropriate validation matters for quantitative bioimage analysis

A key step toward biologically interpretable analysis of microscopy image-based assays is rigorous quantitative validation with metrics appropriate for the particular application in use. Here we describe this challenge for both classical and modern deep learning-based image analysis approaches and discuss possible solutions for automating and streamlining the validation process in the next five to ten years.

  • Jianxu Chen
  • , Matheus P. Viana
  •  &  Susanne M. Rafelski

Article | 10 July 2023

SCS: cell segmentation for high-resolution spatial transcriptomics

Subcellular spatial transcriptomics cell segmentation (SCS) combines information from stained images and sequencing data to improve cell segmentation in high-resolution spatial transcriptomics data.

  • Dongshunyi Li
  •  &  Ziv Bar-Joseph

Research Highlight | 09 June 2023

Capturing hyperspectral images

A single-shot hyperspectral phasor camera (SHy-Cam) enables fast, multiplexed volumetric imaging.

Correspondence | 05 June 2023

Distributed-Something: scripts to leverage AWS storage and computing for distributed workflows at scale

  • Erin Weisbart
  •  &  Beth A. Cimini

Brief Communication | 29 May 2023

New measures of anisotropy of cryo-EM maps

This paper proposes two new anisotropy metrics—the Fourier shell occupancy and the Bingham test—that can be used to understand the quality of cryogenic electron microscopy maps.

  • Jose-Luis Vilas
  •  &  Hemant D. Tagare

Analysis 18 May 2023 | Open Access

The Cell Tracking Challenge: 10 years of objective benchmarking

This updated analysis of the Cell Tracking Challenge explores how algorithms for cell segmentation and tracking in both 2D and 3D have advanced in recent years, pointing users to high-performing tools and developers to open challenges.

  • Martin Maška
  • , Vladimír Ulman
  •  &  Carlos Ortiz-de-Solórzano

Article 15 May 2023 | Open Access

TomoTwin: generalized 3D localization of macromolecules in cryo-electron tomograms with structural data mining

TomoTwin is a deep metric learning-based particle picking method for cryo-electron tomograms. TomoTwin obviates the need for annotating training data and retraining a picking model for each protein.

  • Thorsten Wagner
  •  &  Stefan Raunser

Research Briefing | 12 May 2023

Mapping the motion and structure of flexible proteins from cryo-EM data

A deep learning algorithm maps out the continuous conformational changes of flexible protein molecules from single-particle cryo-electron microscopy images, allowing the visualization of the conformational landscape of a protein with improved resolution of its moving parts.

Article 11 May 2023 | Open Access

Cross-modality supervised image restoration enables nanoscale tracking of synaptic plasticity in living mice

XTC is a supervised deep-learning-based image-restoration approach that is trained with images from different modalities and applied to an in vivo modality with no ground truth. XTC’s capabilities are demonstrated in synapse tracking in the mouse brain.

  • Yu Kang T. Xu
  • , Austin R. Graves
  •  &  Jeremias Sulam

3DFlex: determining structure and motion of flexible proteins from cryo-EM

3D Flexible Refinement (3DFlex) is a generative neural network model for continuous molecular heterogeneity for cryo-EM data that can be used to determine the structure and motion of flexible biomolecules. It enables visualization of nonrigid motion and improves 3D structure resolution by aggregating information from particle images spanning the conformational landscape of the target molecule.

  • Ali Punjani
  •  &  David J. Fleet

Resource 08 May 2023 | Open Access

EmbryoNet: using deep learning to link embryonic phenotypes to signaling pathways

EmbryoNet is an automated approach to the phenotyping of developing embryos that surpasses experts in terms of speed, accuracy and sensitivity. A large annotated image dataset of zebrafish, medaka and stickleback development rounds out this resource.

  • Daniel Čapek
  • , Matvey Safroshkin
  •  &  Patrick Müller

Article 01 April 2023 | Open Access

Rapid detection of neurons in widefield calcium imaging datasets after training with synthetic data

DeepWonder removes background signals from widefield calcium recordings and enables accurate and efficient neuronal segmentation with high throughput.

  • Yuanlong Zhang
  • , Guoxun Zhang

Correspondence | 10 February 2023

MoBIE: a Fiji plugin for sharing and exploration of multi-modal cloud-hosted big image data

  • Constantin Pape
  • , Kimberly Meechan
  •  &  Christian Tischer

Article 23 January 2023 | Open Access

Convolutional networks for supervised mining of molecular patterns within cellular context

DeePiCt (deep picker in context) is a versatile, open-source deep-learning framework for supervised segmentation and localization of subcellular organelles and biomolecular complexes in cryo-electron tomography.

  • Irene de Teresa-Trueba
  • , Sara K. Goetz
  •  &  Judith B. Zaugg

Correspondence | 10 January 2023

JIPipe: visual batch processing for ImageJ

  • Ruman Gerst
  • , Zoltán Cseresnyés
  •  &  Marc Thilo Figge

Application of Swarm Intelligence Optimization Algorithms in Image Processing: A Comprehensive Review of Analysis, Synthesis, and Optimization

Affiliations.

  • 1 School of Intelligent Manufacturing and Electronic Engineering, Wenzhou University of Technology, Wenzhou 325035, China.
  • 2 Intelligent Information Systems Institute, Wenzhou University, Wenzhou 325035, China.
  • PMID: 37366829
  • PMCID: PMC10296410
  • DOI: 10.3390/biomimetics8020235

Image processing has long been a challenging and active topic in artificial intelligence. With the rise of machine learning and deep learning methods, swarm intelligence algorithms have become a popular research direction, and combining image processing techniques with swarm intelligence algorithms has emerged as a new and effective route to improvement. A swarm intelligence algorithm is an intelligent computing method formed by simulating the evolutionary laws, behavioral characteristics, and decision patterns of insects, birds, natural phenomena, and other biological populations; such algorithms offer efficient, parallel global optimization and strong optimization performance. In this paper, the ant colony algorithm, particle swarm optimization, the sparrow search algorithm, the bat algorithm, the thimble colony algorithm, and other swarm intelligence optimization algorithms are studied in depth. Their models, features, improvement strategies, and applications in image processing (image segmentation, image matching, image classification, image feature extraction, and image edge detection) are comprehensively reviewed. Theoretical research, improvement strategies, and applications in image processing are analyzed and compared, and, drawing on the current literature, improvements to these algorithms and their combined application to image processing technology are summarized. Representative swarm intelligence algorithms combined with image segmentation techniques are extracted for tabulated analysis and summary. Finally, the unified framework, common characteristics, and key differences of swarm intelligence algorithms are summarized, open problems are identified, and future trends are projected.

Keywords: edge detection; image features; image processing; image segmentation; swarm intelligence optimization algorithm.
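To make the review's subject concrete, here is a minimal, hypothetical sketch (not code from the paper) of one of the surveyed combinations: particle swarm optimization searching for a grayscale segmentation threshold that maximizes Otsu's between-class variance. All parameter values (swarm size, inertia, acceleration constants) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def between_class_variance(hist, t):
    """Otsu's between-class variance for threshold t on a 256-bin histogram."""
    p = hist / hist.sum()
    w0, w1 = p[:t].sum(), p[t:].sum()
    if w0 == 0 or w1 == 0:
        return 0.0
    mu0 = (np.arange(t) * p[:t]).sum() / w0
    mu1 = (np.arange(t, 256) * p[t:]).sum() / w1
    return w0 * w1 * (mu0 - mu1) ** 2

def pso_threshold(hist, n_particles=20, iters=50, w=0.7, c1=1.5, c2=1.5):
    """Particle swarm search for the threshold maximizing Otsu's criterion."""
    x = rng.uniform(1, 255, n_particles)  # particle positions (thresholds)
    v = np.zeros(n_particles)             # particle velocities
    pbest = x.copy()                      # per-particle best positions
    pval = np.array([between_class_variance(hist, int(t)) for t in x])
    gbest = pbest[pval.argmax()]          # swarm-wide best position
    for _ in range(iters):
        r1, r2 = rng.random(n_particles), rng.random(n_particles)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = np.clip(x + v, 1, 255)
        val = np.array([between_class_variance(hist, int(t)) for t in x])
        improved = val > pval
        pbest[improved], pval[improved] = x[improved], val[improved]
        gbest = pbest[pval.argmax()]
    return int(gbest)

# Synthetic bimodal image: dark background around 60, bright objects around 180.
img = np.concatenate([rng.normal(60, 10, 4000), rng.normal(180, 10, 1000)])
hist, _ = np.histogram(np.clip(img, 0, 255), bins=256, range=(0, 255))
t = pso_threshold(hist)
print("PSO threshold:", t)  # lands in the valley between the two modes
```

For a 1-D search like this, exhaustive evaluation of all 255 thresholds is of course cheaper; the swarm formulation pays off in the multilevel (multi-threshold) settings the review covers, where the search space grows combinatorially.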


Conflict of interest statement

The authors declare that they have no conflicts of interest.

Figure: the framework diagram of the swarm intelligence optimization algorithm.


Grants and funding

  • LY23F010002/Natural Science Foundation of Zhejiang Province



A novel kernel filtering algorithm based on the generalized half-quadratic criterion

  • Original Paper
  • Published: 01 July 2024

  • Yuanlian Huo
  • Zikang Luo
  • Jie Liu

In this work we combine the kernel method with the generalized half-quadratic criterion (GHQC) and propose a kernel adaptive filtering algorithm based on it (KLGHQC). The GHQC guarantees the stability of the algorithm under α-stable distribution noise, and the shape of the GHQC performance surface is controlled by a constant, which improves the algorithm's rate of convergence. Finally, we present simulation results in two settings: Mackey–Glass sequence prediction and nonlinear system identification. The results demonstrate that the proposed KLGHQC algorithm outperforms other kernel filtering algorithms in filtering accuracy and error magnitude.
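For orientation, the family of methods this paper builds on can be sketched with the classic kernel least-mean-squares (KLMS) baseline on the same Mackey–Glass prediction task. This is emphatically not the proposed KLGHQC update (the GHQC loss would reshape the error term in the weight update below), and all parameters (step size, kernel width, embedding length) are illustrative.

```python
import numpy as np

def mackey_glass(n, tau=30, beta=0.2, gamma=0.1, p=10, x0=1.2):
    """Crude unit-step Euler integration of the Mackey-Glass delay equation."""
    x = np.full(n + tau, x0)
    for t in range(tau, n + tau - 1):
        x[t + 1] = x[t] + beta * x[t - tau] / (1 + x[t - tau] ** p) - gamma * x[t]
    return x[tau:]

def klms(u, d, eta=0.2, sigma=1.0):
    """Kernel LMS with a Gaussian kernel and a growing dictionary.

    Classic KLMS baseline, not the paper's KLGHQC: the GHQC criterion
    would replace the plain prediction error `e` in the coefficient update.
    """
    centers, alphas, errors = [], [], []
    for x, y in zip(u, d):
        if centers:
            C = np.array(centers)
            k = np.exp(-np.sum((C - x) ** 2, axis=1) / (2 * sigma ** 2))
            y_hat = float(np.dot(alphas, k))
        else:
            y_hat = 0.0
        e = y - y_hat
        centers.append(x)        # every input becomes a kernel center
        alphas.append(eta * e)   # its coefficient is the scaled error
        errors.append(e)
    return np.array(errors)

s = mackey_glass(600)
L = 7  # predict s[t] from the previous 7 samples
U = np.array([s[i:i + L] for i in range(len(s) - L)])
d = s[L:]
err = klms(U, d)
print("MSE, first 100 samples:", np.mean(err[:100] ** 2))
print("MSE, last 100 samples:", np.mean(err[-100:] ** 2))
```

The prediction error shrinks as the dictionary grows, which is the learning-curve behavior that kernel filtering papers, including this one, compare across algorithms.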


Data availability

All data generated or analyzed during this study are included in this article. Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.


Author information

Zikang Luo and Jie Liu have contributed equally to this work.

Authors and Affiliations

College of Physics and Electronic Engineering, Northwest Normal University, 967 Anning East Road, Lanzhou, 730070, Gansu, China

Yuanlian Huo, Zikang Luo & Jie Liu


Contributions

Yuanlian Huo performed the verification of the experimental design and the graphing; Zikang Luo performed the first-draft writing and the investigation; and Jie Liu performed the technical analysis.

Corresponding author

Correspondence to Yuanlian Huo.

Ethics declarations

Conflict of interest.

The authors declare that they have no conflict of interest.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Huo, Y., Luo, Z. & Liu, J. A novel kernel filtering algorithm based on the generalized half-quadratic criterion. SIViP (2024). https://doi.org/10.1007/s11760-024-03394-9

Download citation

Received : 18 April 2024

Revised : 16 June 2024

Accepted : 19 June 2024

Published : 01 July 2024

DOI : https://doi.org/10.1007/s11760-024-03394-9


  • Kernel adaptive filtering algorithm
  • Generalized half-quadratic criterion
  • α-stable noise
  • Kernel methods

IMAGES

  1. Digital Image Processing Research Proposal [Professional Thesis Writers]

    research about image processing

  2. (PDF) Digital Image Processing Analysis using Matlab

    research about image processing

  3. (PDF) A STUDY ON THE IMPORTANCE OF IMAGE PROCESSING AND ITS APLLICATIONS

    research about image processing

  4. (PDF) Digital Image Processing Using Machine Learning

    research about image processing

  5. Flow Chart of the image processing research method and processes

    research about image processing

  6. (PDF) Image Processing: Research Opportunities and Challenges

    research about image processing

VIDEO

  1. Chopin:24 Preludes-Opus 28-Robert Lortat-laneaudioresearch-2017

  2. What Is Image Processing Technology ? Explained in Details With Example

  3. What is the primary goal of image processing?

  4. Capacitive Textile Touchpad

  5. Processing and displaying data (part 2)

  6. Accelerating 6G connectivity with AI

COMMENTS

  1. Image processing

    Image processing is manipulation of an image that has been digitised and uploaded into a computer. Software programs modify the image to make it more useful, and can for example be used to enable ...

  2. Image Processing: Research Opportunities and Challenges

    Image Processing: Research O pportunities and Challenges. Ravindra S. Hegadi. Department of Computer Science. Karnatak University, Dharwad-580003. ravindrahegadi@rediffmail. Abstract. Interest in ...

  3. (PDF) A Review on Image Processing

    Abstract. Image Processing includes changing the nature of an image in order to improve its pictorial information for human interpretation, for autonomous machine perception. Digital image ...

  4. Deep learning models for digital image processing: a review

    Within the domain of image processing, a wide array of methodologies is dedicated to tasks including denoising, enhancement, segmentation, feature extraction, and classification. These techniques collectively address the challenges and opportunities posed by different aspects of image analysis and manipulation, enabling applications across various fields. Each of these methodologies ...

  5. (PDF) Advances in Artificial Intelligence for Image Processing

    AI has had a substantial influence on image processing, allowing cutting-edge methods and uses. The foundations of image processing are covered in this chapter, along with representation, formats ...

  6. Image Processing

    Image processing is a constantly growing research area that is used in many applications in different fields, such as security, medicine, quality control, and astronomy, among others. It involves different techniques belonging to low-level signal processing; medium-level morphological processing and segmentation for feature selection and ...

  7. Image processing

    Read the latest Research articles in Image processing from Scientific Reports. ... Image processing articles within Scientific Reports. Featured. Article 25 June 2024 | Open Access.

  8. Recent Trends in Image Processing and Pattern Recognition

    Dear Colleagues, The 5th International Conference on Recent Trends in Image Processing and Pattern Recognition (RTIP2R) aims to attract current and/or advanced research on image processing, pattern recognition, computer vision, and machine learning. The RTIP2R will take place at the Texas A&M University—Kingsville, Texas (USA), on November 22 ...

  9. A cognitive deep learning approach for medical image processing

    Automatic analysis of the retinal vascular tree by image processing techniques is essential for many clinical investigations and constitutes a field of scientific research. Detection ...

  10. J. Imaging

    When we consider the volume of research developed, there is a clear increase in published research papers targeting image processing and DL over recent decades. A search using the terms "image processing deep learning" in SpringerLink generated results demonstrating an increase from 1309 articles in 2005 to 30,905 articles in 2022, only ...

  11. Research Topics

    Computer Vision. Computer vision is the science and technology of teaching a computer to interpret images and video as well as a typical human. Technically, computer vision encompasses the fields of image/video processing, pattern recognition, biological vision, artificial intelligence, augmented reality, mathematical modeling, statistics, probability, optimization, 2D sensors, and photography.

  12. Computer Vision and Image Processing: A Beginner's Guide

    The primary aim of image processing is to improve image quality. Whether it's enhancing contrast, adjusting colors, or smoothing edges, the focus is on making the image more visually appealing or suitable for further use. It's about transforming the raw image into a refined version of itself. Image processing focuses on enhancing and ...

  13. Overview of Research Progress of Digital Image Processing Technology

    Digital image processing technology has gone through rapid development and is extensively applied in daily life and production, with the rapid development of modern information technology. It plays an inestimable role in remote sensing, medicine, recognition and other fields. This paper briefly introduces the basic concept of digital image ...

  14. Frontiers

    The field of image processing has been the subject of intensive research and development activities for several decades. This broad area encompasses topics such as image/video processing, image/video analysis, image/video communications, image/video sensing, modeling and representation, computational imaging, electronic imaging, information forensics and security, 3D imaging, medical imaging ...

  15. Image Processing: Techniques, Types, & Applications [2023]

    Task 1: Image Enhancement. One of the most common image processing tasks is image enhancement, or improving the quality of an image. It has crucial applications in Computer Vision tasks, Remote Sensing, and surveillance. One common approach is adjusting the image's contrast and brightness.
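The contrast-and-brightness adjustment mentioned in this snippet can be sketched as a linear point operation, out = alpha * pixel + beta, clipped to the valid range. This is a minimal NumPy illustration; the function name and parameter values are illustrative, not taken from the cited article.

```python
import numpy as np

def adjust_contrast_brightness(img, alpha=1.5, beta=20):
    """Linear point operation: out = alpha * img + beta, clipped to [0, 255].

    alpha > 1 stretches contrast; beta shifts brightness.
    """
    out = alpha * img.astype(np.float64) + beta
    return np.clip(out, 0, 255).astype(np.uint8)

# A tiny 2x2 grayscale "image"; the brightest pixel (250) saturates at 255.
img = np.array([[10, 100], [150, 250]], dtype=np.uint8)
print(adjust_contrast_brightness(img))
```

Computing in floating point before clipping avoids the wrap-around that would occur if the multiplication overflowed the uint8 range.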

  16. IET Image Processing

    IET Image Processing is a major venue for pioneering research that's open to all, in areas related to the generation, processing and communication of visual information. Announcement We wish to announce that Professor Farzin Deravi has stepped down after 18 years as the Editor-in-Chief of IET Image Processing .

  17. Deep Learning-based Image Text Processing Research

    Deep learning is a powerful multi-layer architecture that has important applications in image processing and text classification. This paper first introduces the development of deep learning and two important algorithms of deep learning: convolutional neural networks and recurrent neural networks. The paper then introduces three applications of deep learning for image recognition, image ...

  18. Current Trends in Image Processing and Pattern Recognition

    The international conference on Recent Trends in Image Processing and Pattern Recognition (RTIP2R) aims to attract researchers working on promising areas of image processing, pattern recognition, computer vision, artificial intelligence, and machine learning. • Biometrics: face matching, iris recognition, footprint verification and many more.

  19. Image Processing Technology Based on Machine Learning

    Machine learning is a relatively new field. With the deepening of people's research in this field, the application of machine learning is increasingly extensive. On the other hand, with the advancement of science and technology, graphics have been an indispensable medium of information transmission, and image processing technology is also booming. However, the traditional image processing ...

  20. Viewpoints on Medical Image Processing: From Science to Application

    Multi-modal image processing for enhancing multi-modal imaging procedures primarily deals with image reconstruction and artifact reduction. ... In summary, medical image processing is a progressive field of research, and more and more applications are becoming part of the clinical practice. These applications are based on one or more of the ...

  21. 471383 PDFs

    All kinds of image processing approaches. Explore the latest full-text research PDFs, articles, conference papers, preprints and more on IMAGE PROCESSING. Find methods information, sources ...

  22. Research on Image Processing Technology of Computer Vision Algorithm

    Abstract: With the gradual improvement of artificial intelligence technology, image processing has become a common technology and is widely used in various fields to provide people with high-quality services. Starting from computer vision algorithms and image processing technologies, the computer vision display system is designed, and image distortion correction algorithms are explored for ...

  23. Image processing

    XTC is a supervised deep-learning-based image-restoration approach that is trained with images from different modalities and applied to an in vivo modality with no ground truth. XTC's ...

  24. Application of Swarm Intelligence Optimization Algorithms in Image

    Image processing technology has always been a hot and difficult topic in the field of artificial intelligence. With the rise and development of machine learning and deep learning methods, swarm intelligence algorithms have become a hot research direction, and combining image processing technology with swarm intelligence algorithms has become a new and effective improvement method.

  25. Iris image retrieval using partial matching of image blocks

    Due to technological advancements and security requirements, biometric human identification systems have evolved rapidly [1, 2]. This authentication system uses properties of physiological traits such as fingerprints, palm prints, iris, and others that cannot be forgotten, exchanged, lost, or stolen [3-5]. Among the physiological traits listed above, the iris contains unique features that remain ...

  26. The Research of Image Interpolation Algorithm of Video on FPGA

    Image interpolation algorithm is an important branch of image processing research. When traditional interpolation is applied, it is easy to cause problems such as edge aliasing and details blurring. In order to solve these problems, this paper combines Canny edge detection to improve the interpolation algorithm, which has better interpolation effect when dealing with images with more details ...
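The interpolation this snippet refers to can be illustrated with plain bilinear upscaling, the baseline method whose edge blurring motivates the Canny-guided improvement described above. A minimal NumPy sketch (the function name and the nearest-edge boundary handling are illustrative assumptions, not the paper's algorithm):

```python
import numpy as np

def bilinear_upscale(img, factor):
    """Upscale a 2-D grayscale image by bilinear interpolation."""
    h, w = img.shape
    new_h, new_w = h * factor, w * factor
    # Map each output pixel back to fractional source coordinates.
    ys = np.linspace(0, h - 1, new_h)
    xs = np.linspace(0, w - 1, new_w)
    y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, h - 1)
    x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None]  # fractional row weights
    wx = (xs - x0)[None, :]  # fractional column weights
    f = img.astype(np.float64)
    # Blend the four neighbouring source pixels for every output pixel.
    top = f[y0][:, x0] * (1 - wx) + f[y0][:, x1] * wx
    bot = f[y1][:, x0] * (1 - wx) + f[y1][:, x1] * wx
    return top * (1 - wy) + bot * wy

img = np.array([[0, 100], [100, 200]], dtype=np.uint8)
out = bilinear_upscale(img, 2)  # 2x2 -> 4x4, corners preserved
```

Because each output pixel is a weighted average of its four neighbours, sharp edges are smoothed, which is exactly the aliasing/blurring problem the edge-aware variant targets.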

  27. A novel kernel filtering algorithm based on the generalized half

    In this work we combine the kernel method and the generalized half-quadratic criterion, and a kernel adaptive filtering algorithm is proposed based on the generalized half-quadratic criterion (KLGHQC). The generalized half-quadratic criterion (GHQC) guarantees the stability of the algorithm under the environment of the stable distribution noise, and the shape of the GHQC performance surface is ...

  28. DW-D3A: dynamic weighted dual-driven domain adaptation for cross-scene

    Hyperspectral images (HSIs) are widely studied in the realm of remote sensing image processing for their abundant spectral and spatial information. Compared to multispectral images, HSI offers wide...

  29. image processing associate jobs

    Expertise in optical imaging system design, image processing methodologies, and software development (Matlab and LabVIEW) is required. ... Conduct independent research activities under the guidance of a faculty mentor in preparation for a full-time academic or research career.