Applied and Computational Engineering

- The Open Access Proceedings Series for Conferences

Volume Info.

  • Title

    Proceedings of the 4th International Conference on Signal Processing and Machine Learning

    Conference Date

    ISBN

    978-1-83558-349-4 (Print)

    978-1-83558-350-0 (Online)

    Published Date



    Marwan Omar, Illinois Institute of Technology


  • Open Access | Article 2024-03-27 Doi: 10.54254/2755-2721/52/20241111

    Investigation of generative capacity related to DCGANs across varied discriminator architectures and parameter counts: A comparative study

    Generating lifelike images with generative models remains a significant challenge, and Generative Adversarial Networks (GANs), particularly Deep Convolutional GANs (DCGANs), are commonly employed for image synthesis. This study alters the structure and parameter count of the DCGAN discriminator and investigates the effect on the generated images. The models are assessed with the Fréchet Inception Distance (FID) score, a metric that gauges the quality of generated image samples. Specifically, some convolutional layers are replaced with fully-connected layers, and the outcomes are compared to discern the impact of this structural change. Furthermore, dropout was used to study the influence of the number of parameters: the FID scores of the models were compared at dropout probabilities of 0, 0.2, 0.4, 0.6 and 0.8. Experimental results showed that the DCGAN with fully-connected layers had stronger generative ability than the original one. In addition, the generated images were the most realistic when the dropout probability was 0.6. Finally, the paper explains possible reasons for the differences and proposes a better generative model based on DCGAN.
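
    For reference, the FID score mentioned in this abstract is conventionally defined as below. The symbols are assumptions introduced here, not taken from the article: \mu_r, \Sigma_r and \mu_g, \Sigma_g denote the mean and covariance of Inception-network features for real and generated images respectively. A lower value indicates that the two feature distributions are closer.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```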

  • Open Access | Article 2024-03-27 Doi: 10.54254/2755-2721/52/20241115

    Data augmentation-based enhanced fingerprint recognition using deep convolutional generative adversarial network and diffusion models

    The progress of fingerprint recognition applications encounters substantial hurdles due to privacy and security concerns, leading to limited fingerprint data availability and stringent data quality requirements. This article endeavors to tackle the challenges of data scarcity and data quality in fingerprint recognition by implementing data augmentation techniques. Specifically, this research employed two state-of-the-art generative models in the domain of deep learning, namely Deep Convolutional Generative Adversarial Network (DCGAN) and the Diffusion model, for fingerprint data augmentation. Generative Adversarial Network (GAN), as a popular generative model, effectively captures the features of sample images and learns the diversity of the sample images, thereby generating realistic and diverse images. DCGAN, as a variant model of traditional GAN, inherits the advantages of GAN while alleviating issues such as blurry images and mode collapse, resulting in improved performance. On the other hand, Diffusion, as one of the most popular generative models in recent years, exhibits outstanding image generation capabilities and surpasses traditional GAN in some image generation tasks. The experimental results demonstrate that both DCGAN and Diffusion can generate clear, high-quality fingerprint images, fulfilling the requirements of fingerprint data augmentation. Furthermore, through the comparison between DCGAN and Diffusion, it is concluded that the quality of fingerprint images generated by DCGAN is superior to the results of Diffusion, and DCGAN exhibits higher efficiency in both training and generating images compared to Diffusion.

  • Open Access | Article 2024-03-27 Doi: 10.54254/2755-2721/52/20241136

    Exploring the potential of federated learning for diffusion model: Training and fine-tuning

    Diffusion models, a state-of-the-art class of generative models, have drawn attention for their capacity to produce high-quality, diverse, and flexible content. However, training these models typically requires large datasets, which can be hindered by privacy concerns and data distribution constraints. Because of the amount of data and hardware required for large model training, centralized training is in practice limited to large companies or labs with sufficient computing power. Federated Learning provides a decentralized method that allows model training across several data sources while keeping the data local, reducing privacy threats. This research proposes and evaluates a novel approach for utilizing Federated Learning in the context of diffusion models. This paper investigates the feasibility of training and fine-tuning diffusion models in a federated setting, considering various data distributions and privacy constraints. This study used the Federated Averaging (FedAvg) technique to train an unconditional diffusion model as well as to fine-tune a pre-trained diffusion model. The experimental results demonstrate that federated training of diffusion models can achieve performance comparable to centralized training methods while preserving data locality. Additionally, Federated Learning can be effectively applied to fine-tune pre-trained diffusion models, enabling adaptation to specific tasks without exposing sensitive data. Overall, this work demonstrates Federated Learning's potential as a useful tool for training and fine-tuning diffusion models in a privacy-preserving manner.
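
    The aggregation step of Federated Averaging named in this abstract can be sketched as follows. This is a minimal illustration, not the article's implementation: the `fed_avg` function, the dictionary-of-lists parameter layout, and the two example clients are all assumptions made here. Each client's parameters are weighted by its local sample count, so raw data never leaves the client.

```python
from typing import Dict, List, Tuple

# Hypothetical parameter layout: parameter name -> flat list of floats.
Params = Dict[str, List[float]]


def fed_avg(client_updates: List[Tuple[Params, int]]) -> Params:
    """Average client parameters, weighting each client by its sample count."""
    total = sum(n for _, n in client_updates)
    if total <= 0:
        raise ValueError("need at least one training sample across clients")

    first_params, _ = client_updates[0]
    averaged: Params = {}
    for name, values in first_params.items():
        acc = [0.0] * len(values)
        for params, n in client_updates:
            weight = n / total  # share of the global data held by this client
            for i, v in enumerate(params[name]):
                acc[i] += weight * v
        averaged[name] = acc
    return averaged


# Two illustrative clients holding 100 and 300 local samples.
clients = [
    ({"w": [1.0, 2.0], "b": [0.0]}, 100),
    ({"w": [3.0, 4.0], "b": [1.0]}, 300),
]
global_params = fed_avg(clients)
```

    In a full training loop the server would broadcast `global_params` back to the clients, each client would run local training steps, and the aggregation would repeat for several rounds.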

  • Open Access | Article 2024-03-27 Doi: 10.54254/2755-2721/52/20241138

    The application of federated learning in face recognition: A systematic investigation of the existing frameworks

    This paper presents a thorough examination of the recent progress made in applying federated learning to the field of face recognition. As face recognition technology continues to gain widespread adoption across various sectors, issues related to data privacy and efficiency have taken center stage. In response, federated learning, characterized by its decentralized machine learning approach, has emerged as a promising solution to tackle these pressing concerns. This review categorises the current federated learning frameworks for face recognition into four main purposes: Training Efficiency, Recognition Accuracy, Data Privacy, and Spoof Attack Detection. Each category is explored in-depth, highlighting the principles, structures, applicability, and advantages of the frameworks. The paper also delves into the challenges faced in the integration of federated learning and face recognition, such as high computational overhead, model inconsistency, and data heterogeneity. The review concludes with recommendations for future research directions, emphasising the need for model compression, asynchronous communication strategies, and techniques to address data heterogeneity. The findings underscore the potential and challenges of applying federated learning in face recognition, paving the way for more secure and efficient facial recognition systems.

  • Open Access | Article 2024-03-27 Doi: 10.54254/2755-2721/52/20241204

    Exploring the potential of data augmentation in poetry generation with small-scale corpora

    Poetry generation is a complex task in the field of natural language processing, especially when working with small datasets. Data augmentation techniques have been shown to be an effective way to improve the performance of deep learning models in various tasks, including image classification and speech recognition. Therefore, this study focuses on exploring the impact of four different data augmentation methods - Synonym Replacement, Random Insertion, Random Swap, and Random Deletion - on the performance of poetry generation with a small poetry dataset. The results of the study reveal that Random Insertion performed well in terms of Bilingual Evaluation Understudy (BLEU), Recall-Oriented Understudy for Gisting Evaluation (ROUGE), and manual evaluation when compared to other data augmentation techniques. Synonym Replacement performed poorly in all three evaluations. This study confirms the potential value of data augmentation technology in poetry generation tasks and provides innovative perspectives and directions for future research in this area. Data augmentation can be employed to help address the problem of limited data in poetry generation tasks and enhance the efficiency of deep learning models. Future research could focus on exploring more advanced data augmentation techniques and their impact on poetry generation tasks.
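
    The four augmentation operations compared in this abstract can be sketched on a tokenized line as below. This is a hedged illustration under stated assumptions: the function names, the toy `SYNONYMS` table, and the choice to insert synonyms in Random Insertion are made up here and may differ from the article's setup, which a real system would back with a proper thesaurus.

```python
import random
from typing import Dict, List

# Hypothetical toy synonym table, for illustration only.
SYNONYMS: Dict[str, List[str]] = {
    "moon": ["luna"],
    "bright": ["radiant", "shining"],
    "river": ["stream"],
}


def synonym_replacement(words: List[str], n: int, rng: random.Random) -> List[str]:
    """Replace up to n words that have an entry in SYNONYMS."""
    out = list(words)
    candidates = [i for i, w in enumerate(out) if w in SYNONYMS]
    rng.shuffle(candidates)
    for i in candidates[:n]:
        out[i] = rng.choice(SYNONYMS[out[i]])
    return out


def random_insertion(words: List[str], n: int, rng: random.Random) -> List[str]:
    """Insert n synonyms of words from the line at random positions."""
    out = list(words)
    pool = [w for w in words if w in SYNONYMS]
    for _ in range(n):
        if not pool:
            break
        synonym = rng.choice(SYNONYMS[rng.choice(pool)])
        out.insert(rng.randint(0, len(out)), synonym)
    return out


def random_swap(words: List[str], n: int, rng: random.Random) -> List[str]:
    """Swap two randomly chosen positions, n times."""
    out = list(words)
    if len(out) < 2:
        return out
    for _ in range(n):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(words: List[str], p: float, rng: random.Random) -> List[str]:
    """Drop each word with probability p, always keeping at least one word."""
    if not words:
        return []
    kept = [w for w in words if rng.random() >= p]
    return kept if kept else [rng.choice(words)]


rng = random.Random(0)  # fixed seed so runs are repeatable
line = "the bright moon over the river".split()
augmented = [
    synonym_replacement(line, 2, rng),
    random_insertion(line, 2, rng),
    random_swap(line, 3, rng),
    random_deletion(line, 0.2, rng),
]
```

    Each augmented line would be added to the training corpus alongside the original, which is how such methods enlarge a small dataset.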

  • Open Access | Article 2024-03-27 Doi: 10.54254/2755-2721/52/20241228

    Designing a bias-rating news recommendation system

    Media bias can significantly influence public perception, often subconsciously shaping opinions. To understand and measure this bias, diverse methodologies have emerged. While models from social sciences offer in-depth evaluations, they involve intensive manual analysis. In contrast, computerized models provide speed but often lack depth. This research explores the synergy between these disciplines, aiming to create a robust bias detection tool that combines the meticulousness of social science models with the automation of computer science. Using this interdisciplinary approach, a system was developed to evaluate articles and instantly present a 'bias score' on the user interface. This score offers readers an immediate indication of potential news slant. The research also integrated web crawling techniques into the system, allowing it to identify and recommend alternative articles on analogous subjects. This innovative feature enriches readers' choices, equipping them with multiple narratives for an enriched understanding. In conclusion, this work bridges the gap between depth and speed in media bias detection, offering a novel tool that promotes informed readership. The contribution of this study lies in its interdisciplinary approach and the development of a system that fosters holistic media consumption.

  • Open Access | Article 2024-03-27 Doi: 10.54254/2755-2721/52/20241231

    DeBERTa with hats makes Automated Essay Scoring system better

    Automated Essay Scoring (AES) is a rapidly growing field that applies natural language processing (NLP) and machine learning techniques to the analysis and evaluation of academic essays. By automating the process of evaluating essay quality, AES not only greatly reduces the workload of human graders but also ensures consistency and objectivity in the evaluation process. AES systems can evaluate essays based on multiple criteria, including organization, coherence, and content. With the advent of deep learning, AES has shown significant improvements in accuracy and reliability. AES systems have numerous applications in education, particularly in large-scale assessment and feedback loops. In this article, we delve into the use of an improved Bidirectional Encoder Representations from Transformers (BERT) architecture with disentangled attention mechanism known as DeBERTa for student question-based summarization. This is one of the downstream tasks within AES, which is of great significance for student learning assessment. The organic combination of DeBERTa-v3 and diverse hats like Light Gradient Boosting Machine (LGBM) algorithm and Extreme Gradient Boosting algorithm (XGBoost) has proven to be highly effective in achieving excellent results in this task, indicating their significant potential in real-world AES systems.

  • Open Access | Article 2024-03-27 Doi: 10.54254/2755-2721/52/20241232

    An analysis of BERT-based model for Berkshire stock performance prediction using Warren Buffet's letters

    The objective of this study is to discover and validate effective Bidirectional Encoder Representations from Transformers (BERT)-based models for stock market prediction of Berkshire Hathaway. The stock market is full of uncertainty and dynamism, and its prediction has always been a critical challenge in the financial domain. Accurate predictions of market trends are therefore important for investment decisions and risk management. The primary approach involves sentiment analysis of reviews of market performance. This work uses Warren Buffett’s annual letters to investors and the year-by-year stock market performance of Berkshire Hathaway as the dataset. It leverages three BERT-based models, namely the BERT-Gated Recurrent Units (BERT-GRU) model, the BERT-Long Short-Term Memory (BERT-LSTM) model, and the BERT-Multi-Head Attention model, to analyse Buffett’s annual letters and predict Berkshire Hathaway’s stock price changes. The experiments indicate that all three models have a certain degree of predictive capability, with the BERT-Multi-Head Attention model demonstrating the best predictive performance.

  • Open Access | Article 2024-03-27 Doi: 10.54254/2755-2721/52/20241234

    Exploration, detection, and mitigation: Unveiling gender bias in NLP

    Natural Language Processing (NLP) systems affect everyday life, yet they harbour overt or latent gender bias. The automation of decision-making in NLP models can even exacerbate unfair treatment. In recent years, researchers have started to notice this issue and have proposed approaches to detect and mitigate these biases, yet no consensus on the approaches exists. This paper discusses the interdisciplinary field of linguistics and computer science by presenting the most common gender bias categories and breaking them down with ethical and artificial intelligence approaches. Specific methods for detecting and minimizing bias are presented for biases in raw data, annotators, models, and the linguistic gender system. The paper also gives an overview of the hotspots and future perspectives of this research topic. Limitations of some detection methods are pinpointed, providing novel insights for future research.

  • Open Access | Article 2024-03-27 Doi: 10.54254/2755-2721/52/20241252

    Unleashing the power of Convolutional Neural Networks in license plate recognition and beyond

    This article explores the application of Convolutional Neural Networks (CNN) in the field of license plate recognition. It begins by introducing the architecture of CNN, which consists of three key layers: Convolutional Layers, Pooling Layers, and Fully Connected Layers. The article then references three relevant papers that demonstrate how CNNs are applied in license plate recognition. The first paper utilizes TensorFlow to construct a CNN model and integrates it with an STM32MP157 embedded chip for license plate recognition. The second paper presents a real-time car license plate detection and recognition method called Multi-Task Light CNN, emphasizing robustness. The third paper employs the ResNet+FPN feature extraction network of the Mask R-CNN model and annotates a license plate dataset. The article highlights the promising future of CNNs in various fields beyond license plate recognition, emphasizing their potential for further development and industrial applications. CNNs have proven to be versatile and powerful tools in computer vision, offering solutions to a wide range of problems. Their adaptability and effectiveness make them a key player in the ongoing advancement of artificial intelligence and automation technologies.

  • Open Access | Article 2024-03-27 Doi: 10.54254/2755-2721/52/20241263

    An investigation into the short-circuit characteristics of Sic MOSFET power devices

    Silicon carbide (SiC) metal-oxide-semiconductor field-effect transistor (MOSFET) devices exhibit substantial prospects for application under extreme operational conditions, including elevated temperatures, high voltages, and high frequencies. Nevertheless, owing to their distinctive material and structural attributes, SiC MOSFET devices are not devoid of challenges, with the short-circuit phenomenon constituting a pivotal avenue of inquiry. The short-circuit effect pertains to the abrupt escalation of leakage current that these devices might undergo under elevated voltage conditions, thereby exerting a perturbing influence on their stability and reliability. Investigations into the short-circuit effect predominantly revolve around two dimensions: one involves comprehending its underlying physical mechanisms, while the other centers on identifying commensurate remedial approaches. With respect to the underlying physical mechanisms, researchers have discerned that the elevated breakdown field strength and augmented carrier mobility intrinsic to SiC materials engender an augmentation in leakage current, consequently giving rise to the short-circuit effect. Furthermore, factors such as oxide layer anomalies and surface states are also conceivable catalysts for the surge in leakage current. To rectify this predicament, scholars have proffered a panoply of stratagems, encompassing the optimization of material synthesis processes, enhancement of oxide layer quality, refinement of device structural designs, and incorporation of protective circuitry, among others. In summation, the investigation of the short-circuit effect in silicon carbide MOSFET devices is fundamentally aimed at attaining an in-depth comprehension of its causative mechanisms. Moreover, it endeavors to proffer efficacious resolutions conducive to augmenting the reliability and steadfastness of these devices within high-temperature and high-voltage environments, thereby facilitating their widespread integration within the ambit of high-performance power electronics.

  • Open Access | Article 2024-03-27 Doi: 10.54254/2755-2721/52/20241292

    An analysis of different methods for deep neural network pruning

    Neural network pruning, the process of removing unnecessary weights or neurons from a neural network model, has become an essential technique for reducing computational cost and increasing processing speed, thereby improving overall performance. This article groups current pruning methods into three classes (channel pruning, filter pruning, and parameter sparsification), explains how each method works, and highlights key related works in each category. Each approach has its own strengths: channel pruning is particularly useful for reducing model depth and width, filter pruning is more suitable for maintaining model depth while decreasing storage requirements, and parameter sparsification can be applied across various network architectures to achieve both storage and computational efficiency. Future research in neural network pruning could focus on developing more sophisticated techniques that automatically identify the important weights or neurons within a network.
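
    Two of the pruning families named in this abstract can be illustrated with simple baselines. The sketch below is an assumption-laden example, not a method taken from the article: `magnitude_prune` zeroes the smallest-magnitude weights as a basic form of parameter sparsification, and `rank_filters_by_l1` keeps the filters with the largest L1 norm as a basic filter-pruning criterion. Both function names and the example `layer` are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def magnitude_prune(weights: Matrix, sparsity: float) -> Matrix:
    """Zero out the given fraction of weights with the smallest magnitude."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must lie in [0, 1]")
    # Sort every weight position by absolute value, smallest first.
    indexed = sorted(
        (abs(w), r, c) for r, row in enumerate(weights) for c, w in enumerate(row)
    )
    k = int(len(indexed) * sparsity)  # number of weights to remove
    pruned = [list(row) for row in weights]
    for _, r, c in indexed[:k]:
        pruned[r][c] = 0.0
    return pruned


def rank_filters_by_l1(filters: Matrix, keep: int) -> List[int]:
    """Return the indices of the `keep` filters with the largest L1 norm."""
    norms = [(sum(abs(w) for w in f), i) for i, f in enumerate(filters)]
    kept = sorted(norms, reverse=True)[:keep]
    return sorted(i for _, i in kept)


# Illustrative 3x3 layer: each row is treated as one flattened filter.
layer = [
    [0.9, -0.05, 0.3],
    [0.01, -0.7, 0.02],
    [0.2, 0.04, -0.6],
]
sparse_layer = magnitude_prune(layer, 0.5)
kept_filters = rank_filters_by_l1(layer, 2)
```

    Sparsification keeps the layer's shape and only introduces zeros, whereas filter pruning removes whole rows, which is why the two approaches trade off storage and compute differently.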

  • Open Access | Article 2024-03-27 Doi: 10.54254/2755-2721/52/20241321

    Effectiveness of finetuning pretrained BERT and deBERTa for automatic essay scoring

    With the growing importance of summary writing skills in the educational system and the inherent complexity of manual assessment, there is an urgent need for automated summary scoring solutions. Pre-trained models such as Bidirectional Encoder Representations from Transformers (BERT) and Decoding-enhanced BERT with disentangled attention (deBERTa) are now popular. However, using these pre-trained models directly on specific tasks still leaves room for improvement. This paper examines how the performance of summary scoring systems changes after linear and dropout layers are added to these pre-trained models for feature extraction and dimensionality reduction. It details the optimization applied for the particular task of automated summary scoring, with the aim of making the model more adaptable to this specific educational task. Ultimately, it is hoped that these studies will enhance the pedagogical toolkit for educators and enrich the academic experience for students.

  • Open Access | Article 2024-03-27 Doi: 10.54254/2755-2721/52/20241322

    Performance evaluation of Latent Dirichlet Allocation on legal documents

    Latent Dirichlet Allocation (LDA) is an algorithm capable of processing large amounts of text data. In this study, LDA is used to model topic clusters from a corpus of legal texts covering four topics within the Nigerian context: Employment Contract, Election Petition, Deeds, and Articles of Incorporation. Each topic has a substantial number of articles, and the LDA method proves effective in extracting topics and generating the index words in each topic cluster. At the end of experimentation, the results are compared with a manually pre-annotated dataset for validation and show high accuracy. The LDA output shows its best word-indexing performance for Election Petition, as all the documents annotated under that topic were accurately classified.

  • Open Access | Article 2024-03-27 Doi: 10.54254/2755-2721/52/20241380

    Sentiment analysis of Twitter user text based on the BERT model

    Deep Neural Networks (DNNs) utilizing Recurrent Neural Network (RNN) architectures have found extensive application in text sentiment analysis. A prevailing notion suggests that augmenting the model's capacity can significantly improve accuracy and overall model performance. Building upon this premise, this paper advocates the adoption of a larger BERT model for text sentiment analysis. Bidirectional Encoder Representations from Transformers (BERT) is a sophisticated pre-trained language comprehension model that leverages Transformers as feature extractors. However, as the amount of model data increases and exceeds the memory limits of a single GPU, algorithm optimization becomes crucial. Therefore, this paper employs two methods, namely data parallelism and GPipe parallelism, to accelerate and optimize the BERT model. Compared to a single GPU, training speed increases almost linearly as more GPUs are added. In addition, this research investigates the accuracy of the most advanced language model, ChatGPT, by reannotating the dataset. During training, it was observed that accuracy on the ChatGPT-annotated dataset declined significantly in both the RNN and BERT models. This indicates that ChatGPT still exhibits some errors in text sentiment analysis.

  • Open Access | Article 2024-03-27 Doi: 10.54254/2755-2721/52/20241443

    Analysis of computational power as a potential breakthrough in advancing AI technology

    The term Artificial Intelligence (AI) has become more common recently. The performance and fame of ChatGPT brought a new AI fever to industry, with multiple big companies announcing their upcoming “AI plans”. People are arguing about whether AI will help them or steal their jobs. It seems that AI will be walking into people’s daily lives, if it has not done so already. This article therefore analyzes the current challenges and potential future breakthroughs of artificial intelligence by focusing on one of the most fundamental factors that support the development and operation of AI: computational power. In particular, it analyzes the relationship between AI performance and computer performance across different eras, and it summarizes and analyzes several sources published in related fields. The primary purpose of this article is to give readers a better overview of present AI technology by taking a close look at the history, present difficulties, and potential solutions of AI and computational power, and it closes with the prospects of each solution and the expected future of the AI industry. This paper concludes that the development of AI relies on the computer performance available to the industry. Finding a way to obtain better computer performance at a lower cost might be the next breakthrough in the AI industry.

  • Open Access | Article 2024-03-27 Doi: 10.54254/2755-2721/52/20241446

    Learning based multi-robot coverage algorithm

    Multi-robot coverage algorithms are essential in exploration, search and rescue, tracking, and other tasks. At present, approaches based on global planning struggle to solve coverage problems in real deployments of very large robot teams. In this article we use a heuristic algorithm based on graph neural networks (GNNs) to solve the multi-robot coverage problem. First, we discretize the coverage task and encode it as a graph in which the map locations and the robots are nodes. Then we design a graph neural network controller and train it with imitation methods. By imitating an open-loop expert solution based on VPR, the controller generates solutions that are not inferior to the expert's. Finally, we design a graph neural network architecture that performs zero-shot generalization to large maps and teams, enabling the system to scale to larger maps and teams, which is difficult for the expert. We successfully use this model to simulate 10 quadcopters and a number of buildings in a city. We also show that the GNN controller outperforms the planning-based method in the exploration task.

  • Open Access | Article 2024-03-27 Doi: 10.54254/2755-2721/52/20241449

    Comparison between tendon-driven and soft pneumatic actuators in soft robotic hands

    Bionic robotic hands have huge potential for applications. Traditional rigid robotic hands have the disadvantages of complex control and poor compliance, while soft robotic hands can largely overcome these drawbacks. Tendon-driven actuators and soft pneumatic actuators are the two main actuating methods in soft robotic hands, which are increasingly used due to their good compliance. This article first reviews the basic principles and structures of these two actuating methods to build a basic understanding of each. Then, by comparing the dexterity and grasp performance of two specific designs, it analyzes the characteristics of both methods in terms of their structures. Through discussion of structural differences, the systematic impact of the two actuating methods on function is analyzed and a general conclusion about differences in applicability is drawn. This article aims to help researchers better understand the two main actuating methods for soft robotic hands before they choose one, which can contribute to better completion of the expected functions.

  • Open Access | Article 2024-03-27 Doi: 10.54254/2755-2721/52/20241470

    Sensor and sensor fusion technology in autonomous vehicles

    The perception and navigation of autonomous vehicles rely heavily on sensor technology and sensor fusion techniques, which play an essential role in ensuring a safe and accurate understanding of the vehicle's environment. This paper highlights the significance of sensors in autonomous vehicles and how sensor fusion techniques enhance their capabilities. Firstly, the paper introduces the different types of sensors commonly used in autonomous vehicles and explains their principles of operation, strengths, and limitations in capturing essential information about the vehicle’s environment. Next, the paper discusses various sensor fusion algorithms, such as Kalman filters and particle filters. Furthermore, the paper explores the challenges associated with sensor fusion and addresses the issue of handling sensor failures or uncertainties. The benefits of sensor fusion technology in autonomous vehicles are also presented. These include improved perception of the environment, enhanced object recognition and tracking, better trajectory planning, and enhanced safety through redundancy and fault tolerance. Lastly, the paper discusses recent advancements and highlights the integration of artificial intelligence and machine learning techniques to optimize sensor fusion algorithms and improve the overall autonomy of the vehicle. The analysis concludes that sensor and sensor fusion technology play a critical role in enabling efficient and safe autonomous vehicle navigation in complex surroundings.
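
    The Kalman filter mentioned in this abstract can be illustrated with a one-dimensional measurement update that fuses two range sensors. This is a minimal sketch under assumptions made here, not the paper's algorithm: the `kalman_update` function, the sensor names, and all numeric readings and variances are hypothetical. Each update weights the new measurement by the Kalman gain, so the less noisy sensor moves the estimate more.

```python
from typing import Tuple


def kalman_update(mean: float, var: float, z: float, r: float) -> Tuple[float, float]:
    """Fuse a Gaussian estimate (mean, var) with a measurement z of variance r."""
    gain = var / (var + r)               # Kalman gain: trust placed in the measurement
    new_mean = mean + gain * (z - mean)  # shift the estimate toward the measurement
    new_var = (1.0 - gain) * var         # uncertainty shrinks after every update
    return new_mean, new_var


# Vague prior on the distance to an obstacle, in metres (hypothetical values).
mean, var = 0.0, 1000.0

# Hypothetical readings: a noisier radar and a more precise lidar.
radar_reading, radar_var = 10.0, 4.0
lidar_reading, lidar_var = 10.4, 1.0

mean, var = kalman_update(mean, var, radar_reading, radar_var)
fused_mean, fused_var = kalman_update(mean, var, lidar_reading, lidar_var)
```

    The fused variance ends up below that of either sensor alone, which is the redundancy benefit the abstract attributes to sensor fusion. A full filter would also include a prediction step that propagates the state between measurements.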

  • Open Access | Article 2024-03-27 Doi: 10.54254/2755-2721/52/20241481

    A comprehensive overview of the application of artificial intelligence in language learning

    With the rapid advancement of artificial intelligence, AI technology has made significant progress in the field of language. Machine translation has become the dominant method, replacing manual translation due to its convenience and speed. This article will discuss three different aspects: translation, information retrieval, and language artificial intelligence. In the translation section, three distinct translation models will be analyzed, using Google Translate as a foundation. These models have transformed the translation industry and improved accuracy and efficiency. In the information retrieval section, the differences between semantic search involving AI and traditional keyword-based search techniques will be explored. Semantic search, driven by AI, provides more accurate and relevant search results by understanding the context and intent behind user queries. The impact of these advancements on search engine optimization (SEO) practices will also be discussed. Furthermore, the article will delve into the types of speech recognition and classify speech recognition technologies. Finally, the article will summarize the entire content and provide an outlook on future developments.

Copyright © 2023 EWA Publishing. Unless Otherwise Stated