Applied and Computational Engineering
- The Open Access Proceedings Series for Conferences
Proceedings of the 5th International Conference on Computing and Data Science
2023-07-14
978-1-83558-033-2 (Print)
978-1-83558-034-9 (Online)
2023-10-23
Roman Bauer, University of Surrey
Alan Wang, University of Auckland
Marwan Omar, Illinois Institute of Technology
Based on interviews conducted with students at a university in Yunnan Province, this study identifies three problems in the university's management system APP: excessive navigation information, a cluttered interface layout, and difficulty locating school services. From the perspective of information architecture, the study designs a cognitive model suited to this APP. Through tracking surveys and interviews with students from different majors, the affinity diagram method is used to construct a cognitive model of student users, which is further categorized by 10 participants. The categorized data are analyzed using hierarchical cluster analysis, and the APP's information architecture is restructured accordingly. Experiments reveal the characteristics and shortcomings of the APP's information architecture, and the navigation labels of the management system are improved. The reconstructed cognitive model yields navigation names and classification schemes that align with the cognitive models of student users. The findings provide a basis and reference for other university management system APPs.
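The hierarchical cluster analysis step described above can be sketched as follows, using a hypothetical card-sort matrix (the labels and groupings are invented, not the study's data) and SciPy's average-linkage clustering:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical card-sort data: rows are participants, columns are navigation
# labels; each cell is the group a participant placed that label into.
# Labels sorted together by many participants should end up in one cluster.
labels = ["grades", "timetable", "library", "tuition", "dorm fees"]
sorts = np.array([
    [1, 1, 2, 3, 3],   # participant 1's grouping of the five labels
    [1, 1, 2, 3, 3],
    [1, 1, 1, 2, 2],
    [1, 1, 2, 3, 3],
])

# Co-occurrence distance: fraction of participants who did NOT put a pair
# of labels into the same group.
n = len(labels)
dist = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        dist[i, j] = np.mean(sorts[:, i] != sorts[:, j])

# Condensed upper-triangle distances for SciPy's average-linkage clustering.
condensed = dist[np.triu_indices(n, k=1)]
tree = linkage(condensed, method="average")
clusters = fcluster(tree, t=2, criterion="maxclust")
print(dict(zip(labels, clusters)))
```

Cutting the dendrogram into two clusters groups "grades" with "timetable" and "tuition" with "dorm fees", mirroring how the participants sorted them.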
In the VUCA (Volatile, Uncertain, Complex, Ambiguous) era, construction management faces unique challenges, particularly in the realm of workload assessment. This paper proposes an innovative solution to address these challenges: a blockchain-based GameFi model for construction workload assessment. This model leverages blockchain Non-Fungible Token (NFT) technology to issue various badges, reliably representing employees' workload. The technology ensures the authenticity and tamper-proof nature of the badges, thereby fostering employee motivation through a distributed trust system. This study provides fresh insights into the construction industry's digital transformation and is anticipated to lay the groundwork for similar applications in other sectors.
Facial expression recognition has significant implications across fields such as psychology, computer science, and artificial intelligence. This paper proposes a combination of a Feature Pyramid Network (FPN) and a Residual Network (ResNet) to construct a recognition model. The main objective of the proposed model is to refine the multi-level feature representation of facial expressions. This approach aims to provide a more holistic understanding of the diverse and complex nature of facial expressions, recognizing the intricate interplay between macro- and micro-expressions. Experimental results underscore the model's considerable superiority over traditional methods, particularly in terms of accuracy and adaptability to objects of varying sizes and complexities. This comprehensive approach to facial expression recognition showcases the potential of integrating different neural network architectures, furthering our understanding of the subtleties of facial expressions. The research therefore presents a significant contribution to the field of facial expression recognition, demonstrating the efficacy of integrating multi-scale feature extraction techniques to improve model performance. It sets the stage for future research directions in this domain, paving the way for more sophisticated emotion recognition systems that can be deployed in real-world applications.
Diabetes is now a common disease that causes serious symptoms: patients face a painful life, high treatment costs, and even death. It is therefore necessary to diagnose diabetes correctly and to analyze its main contributing factors in order to prevent its onset. This essay focuses on training models to assist doctors in diagnosis. Three Naive Bayes classifiers are trained to make the prediction: Gaussian Naive Bayes, Bernoulli Naive Bayes, and Multinomial Naive Bayes. To compare the methods, accuracy is the main index for judging whether a method is good enough to put into use; the classification report and confusion matrix also help assess the predictions. The Gaussian Naive Bayes achieves the highest accuracy, and when the confusion matrix and classification report are taken into account, it has a clear advantage over the other two models. The accuracy of these models still does not meet medical demands, so deep learning and more advanced models are expected to improve this project.
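As a sketch of how the Gaussian variant scores each class, the following minimal, hand-rolled Gaussian Naive Bayes runs on an invented two-feature toy table (glucose and BMI columns are illustrative, not the paper's dataset):

```python
import numpy as np

def fit_gnb(X, y):
    """Per class, store the prior plus per-feature mean and variance."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[c] = (len(Xc) / len(X), Xc.mean(axis=0), Xc.var(axis=0) + 1e-9)
    return params

def predict_gnb(params, X):
    """Pick the class with the highest Gaussian log joint likelihood."""
    scores = []
    for c, (prior, mu, var) in params.items():
        log_lik = -0.5 * (np.log(2 * np.pi * var) + (X - mu) ** 2 / var)
        scores.append((c, np.log(prior) + log_lik.sum(axis=1)))
    classes = np.array([c for c, _ in scores])
    stacked = np.vstack([s for _, s in scores])
    return classes[stacked.argmax(axis=0)]

# Toy, made-up samples: columns are (glucose, BMI).
X = np.array([[85, 22.0], [90, 24.0], [88, 23.0],      # class 0: non-diabetic
              [160, 33.0], [155, 35.0], [170, 34.0]])  # class 1: diabetic
y = np.array([0, 0, 0, 1, 1, 1])
model = fit_gnb(X, y)
print(predict_gnb(model, np.array([[92, 23.5], [165, 34.5]])))  # → [0 1]
```

The Bernoulli and Multinomial variants compared in the paper differ only in the per-feature likelihood term; the argmax-over-classes structure is identical.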
With the increasing use of social media in our daily lives, it is crucial to maintain safe and inclusive platforms for users of diverse backgrounds. Offensive content can inflict emotional distress, perpetuate discrimination towards targeted individuals and groups, and foster a toxic online environment. While natural language processing (NLP) has been employed for automatic offensive language detection, most studies focus on English only, leaving languages other than English understudied due to limited training data. This project fills this gap by developing a novel multilingual model for offensive language detection in 100 languages, leveraging existing English resources. The model employs graph attention mechanisms in transformers, improving its capacity to extend from English to other languages. Moreover, this work breaks new ground as the first study ever to identify the specific individuals or groups targeted by offensive posts. Statistical analysis using F1 scores shows high accuracy in offensive language classification and target recognition across multiple languages. This innovative model is expected to enable multilingual offensive language detection and prevention in social media settings. It represents a significant step forward in the field of offensive language detection, paving the way for a safer and more inclusive social media experience for users worldwide.
The rapid development of artificial intelligence (AI) brings many new problems, among which the ethical and moral issues raised by AI have attracted widespread attention. Because AI evolves so quickly, many moral problems that were never envisaged have emerged. To avoid such problems and provide solutions for the future, this paper analyzes the moral and ethical problems that AI may bring, the solutions that have been adopted, and the results of adopting them. The research data are drawn from published research, reports, government legislation, and similar sources. This research argues that AI should comply with current moral principles and be released only after solutions exist for the moral and ethical problems it may raise.
Parkinson’s disease (PD) is a neurodegenerative disorder afflicting over 10 million patients worldwide, most commonly the elderly, that causes tremors, stiffness, loss of movement control, and other symptoms. Because symptoms are often mild in the early stages, PD can be hard to notice and diagnose until the condition has become severe. Since there is no known cure for PD, earlier diagnosis allows treatment to begin sooner and keeps the symptoms from worsening too quickly. The goal of this work is to develop an affordable, non-intrusive, and accessible way of diagnosing PD, removing the need for lengthy consultations with doctors and possibly expensive testing and medical equipment. This work presents the FaceTell system, which combines and optimizes traditional machine learning and deep learning to predict a patient’s PD status from video of their face. By analyzing attributes such as facial expressions and emotion prevalence and intensity, the model achieves a thorough examination of the patient’s condition and predictions of accuracy similar to prior results. One main innovation is affordable data collection: sampling publicly available videos from platforms like YouTube. This serves as a proof of concept that simple, affordable, and non-intrusive data collection can still produce viable results. Using methods and tools such as hyperparameter tuning, data cleaning, and Face++, the performance of the system improved markedly. The final results include an F1 score of 0.86 and an accuracy of 89%, compared with prior results that reached up to 95% accuracy.
Sarcasm prediction is a text analysis task that aims to distinguish sarcastic from non-sarcastic statements in text. Sarcasm is a figure of speech that uses opposite or contradictory language to express a certain meaning or idea. Sarcasm is usually cryptic, vague, and suggestive, which makes its prediction a challenging task. In sarcasm prediction projects, natural language processing techniques are usually leveraged to analyze and classify the text. The main challenge lies in the fact that sarcasm takes multiple forms and requires the contextual and semantic information of the text to be considered. Sarcasm prediction holds significant application value in natural language processing, for example in social media analysis, public opinion monitoring, and sentiment analysis. In this paper, by controlling variables, the influence of adding long short-term memory (LSTM) layers and changing the network structure of the model on prediction accuracy is explored. Moreover, the accuracy of the LSTM model is compared with that of the Bidirectional Encoder Representations from Transformers (BERT) model. The paper also analyzes and discusses why increasing the number of LSTM layers does not yield higher prediction accuracy and why an accuracy gap exists between the LSTM and BERT models, and draws the relevant conclusions.
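The LSTM gating that lets such sequence models carry contextual information can be sketched in NumPy as a single forward step (toy sizes and random weights, not the paper's trained model):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, U, b):
    """One LSTM time step. W: (4H, D) input weights, U: (4H, H) recurrent
    weights, b: (4H,) bias; gate order here is input, forget, cell, output."""
    H = h.shape[0]
    z = W @ x + U @ h + b
    i = sigmoid(z[0:H])          # input gate: how much new info to write
    f = sigmoid(z[H:2 * H])      # forget gate: how much old memory to keep
    g = np.tanh(z[2 * H:3 * H])  # candidate cell state
    o = sigmoid(z[3 * H:4 * H])  # output gate
    c_new = f * c + i * g        # cell state carries long-range context
    h_new = o * np.tanh(c_new)
    return h_new, c_new

rng = np.random.default_rng(0)
D, H = 8, 4                      # embedding size and hidden size (made up)
W = rng.normal(size=(4 * H, D))
U = rng.normal(size=(4 * H, H))
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for x in rng.normal(size=(5, D)):    # run over a 5-token toy sequence
    h, c = lstm_step(x, h, c, W, U, b)
print(h.shape)  # → (4,)
```

The final hidden state `h` would feed a dense sigmoid classifier for the sarcastic/non-sarcastic decision; BERT replaces this recurrence with self-attention over the whole sequence.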
With the development of the Internet, information sharing has increased, and each user is exposed to ever more information. How to find the information people want among so much information is a very important question, and the vast majority of these resources are textual. The problem is most visible in search engines: a user enters a piece of text to search for relevant websites, and if the algorithm is poor, the results are very unsatisfactory. Therefore, this paper studies the application of text similarity to text clustering in the Chinese context. First, the basic concept of text similarity is introduced, and text clustering is explained from three aspects: definition, applications, and the general processing pipeline. Second, drawing on existing work, the mainstream clustering algorithms are comprehensively summarized. Then, building on the above, the similarity calculation methods used in text clustering are analyzed. Finally, these methods are compared and analyzed according to experimental results obtained in a Python environment.
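The standard TF-IDF plus cosine-similarity pipeline that underlies such text-similarity calculations can be sketched as follows (made-up English documents stand in for a Chinese corpus, which would need word segmentation first; this is the textbook pipeline, not the authors' exact implementation):

```python
import math
from collections import Counter

# Three tiny illustrative documents; the first two are about the same topic.
docs = [
    "machine learning for text clustering",
    "text clustering with machine learning methods",
    "weather forecast for the weekend",
]
tokenized = [d.split() for d in docs]
vocab = sorted({w for toks in tokenized for w in toks})
N = len(docs)
# Inverse document frequency: rare terms get more weight.
idf = {w: math.log(N / sum(w in toks for toks in tokenized)) for w in vocab}

def tfidf(toks):
    """Term-frequency times IDF, as a dense vector over the vocabulary."""
    tf = Counter(toks)
    return [tf[w] / len(toks) * idf[w] for w in vocab]

def cosine(a, b):
    """Cosine of the angle between two vectors; 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

vecs = [tfidf(t) for t in tokenized]
print(cosine(vecs[0], vecs[1]), cosine(vecs[0], vecs[2]))
```

The on-topic pair scores much higher than the unrelated pair; a clustering algorithm then groups documents by exactly these pairwise similarities.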
Brain tumors pose a substantial health challenge globally. Their accurate detection and segmentation are crucial for effective treatment, and recent advancements in machine learning (ML) present a promising solution to these tasks. This paper provides a comprehensive analysis of traditional and modern ML algorithms for brain tumor detection and segmentation. It highlights the pivotal role of ML in advancing brain tumor analysis and how it can potentially mitigate the impact of malignant tumors. Traditional image processing techniques have shown their value but face limitations in dealing with the complexity of brain tumors. The integration of ML has substantially enhanced the capabilities of traditional detection techniques, with architectures such as convolutional neural networks (CNNs) providing improved results. Moreover, brain tumor segmentation techniques have also seen significant enhancements, with the transition from conventional techniques like Region Growing and Watershed methods to state-of-the-art deep learning methods such as U-Net. Despite these advancements, significant challenges remain, and ongoing research is necessary to further harness the potential of ML in brain tumor diagnosis and treatment. The findings of this review underscore the significance of ML in brain tumor analysis and its profound potential impact on patient outcomes and the overall landscape of cancer treatment.
In real life, facial emotion recognition is very important because it conveys information, builds relationships, and facilitates communication; emotion recognition technology is therefore used in medicine, education, entertainment, security, and other fields. In this area, the Facial Emotion Recognition 2013 dataset (FER-2013), which contains images of seven emotional expressions, has been widely used. For detecting emotions from facial expressions, deep learning architectures, especially convolutional neural networks (CNNs), have demonstrated significant potential thanks to their feature extraction ability and computational efficiency. In this paper, the author constructs a model named Improved VGG-16, based on the Visual Geometry Group network with 16 weight layers (VGG-16). Specifically, the author first adds two dense layers to improve the model's complexity and expressiveness, and second uses two dropout layers to reduce overfitting. The model achieves an accuracy of 68.0% on the FER-2013 test set. This result is better than some previous methods and shows that the Improved VGG-16 model can recognize facial expressions effectively. In conclusion, this work aims to increase the accuracy and reliability of facial emotion recognition, supporting research and applications in related fields.
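The dropout mechanism used here to curb overfitting can be sketched as inverted dropout in NumPy (an illustration of the idea only; deep learning frameworks implement this inside their layer APIs):

```python
import numpy as np

def dropout(x, rate, training, rng):
    """Inverted dropout: during training, zero a random fraction `rate` of
    activations and rescale the survivors by 1/(1-rate), so the expected
    activation is unchanged and inference needs no adjustment."""
    if not training or rate == 0.0:
        return x
    mask = rng.random(x.shape) >= rate
    return x * mask / (1.0 - rate)

rng = np.random.default_rng(42)
acts = np.ones((1000,))                      # toy layer activations
train_out = dropout(acts, rate=0.5, training=True, rng=rng)
eval_out = dropout(acts, rate=0.5, training=False, rng=rng)
print((train_out == 0).mean())               # roughly half the units are zeroed
```

Randomly silencing units this way prevents co-adaptation between neurons, which is why the paper inserts two such layers after the added dense layers.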
Recognizing facial expressions automatically is important to interactive computer systems, since facial expression is an efficient means of conveying emotions. Over the past few years, many researchers have attempted to use deep learning for expression recognition. The advantage of deep learning lies in its ability to learn features from datasets automatically, without relying on hand-crafted features. This paper analyzes which mechanisms are useful for expression recognition in deep learning by comparing the performance of popular models and algorithms from recent research. The models are trained on the Facial Expression Recognition dataset (FER-2013), which is relatively small and imbalanced, and their performance is then assessed on its private test set. Specifically, Residual Network (ResNet), Visual Geometry Group network (VGGNet), and MobileNet are evaluated on running time, number of parameters, and accuracy. A Squeeze-and-Excitation (SE) block is utilized in the ResNet, enabling the models to learn useful features from global information. ResNet34 with SE blocks inserted (SE-ResNet34) achieves the highest private test accuracy of 70.80%. Experimental results show that residual learning enables models to go deeper without degradation and that the SE block helps the model learn global information.
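The forward pass of a Squeeze-and-Excitation block can be sketched in NumPy (toy shapes and random weights; in SE-ResNet the block sits inside each residual stage):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def se_block(x, W1, W2):
    """Squeeze-and-Excitation: global-average-pool each channel ("squeeze"),
    pass the result through a two-layer bottleneck ("excitation"), then
    rescale each channel by its learned gate. x: (C, H, W);
    W1: (C//r, C) and W2: (C, C//r) with reduction ratio r."""
    s = x.mean(axis=(1, 2))                        # squeeze: (C,) descriptors
    e = sigmoid(W2 @ np.maximum(W1 @ s, 0.0))      # excitation: gates in (0, 1)
    return x * e[:, None, None]                    # recalibrate feature maps

rng = np.random.default_rng(1)
C, H, W, r = 16, 8, 8, 4                           # toy sizes; r = reduction ratio
x = rng.normal(size=(C, H, W))
W1 = rng.normal(size=(C // r, C))
W2 = rng.normal(size=(C, C // r))
y = se_block(x, W1, W2)
print(y.shape)  # → (16, 8, 8)
```

Because the gates are computed from a global average over each whole feature map, the block injects global context into otherwise local convolutional features, which is the effect the experiments credit for the accuracy gain.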
The use of virtual reality (VR) in memory recovery has considerable potential that is only beginning to materialize. The exploratory studies conducted so far suggest that VR can improve both the assessment of memory impairment and memory repair through reconstruction techniques. Because current VR technology is mostly applied to game development, research on memory reproduction remains relatively scarce. In this paper, a memory gallery was built through 3D (three-dimensional) modeling, and MetaHuman technology was used to recreate characters, combined with Unreal Engine 5 development to reproduce important memories. The work combines a 3D-modeled memory corridor, MetaHuman digital-human technology, and VRChat social VR to give users an immersive, interactive reproduction of precious memories. Experiments show that memory reproduction greatly helps emotional repair, and VR contributes by making the presentation more immersive.
Traditional segmentation approaches, supervised deep learning methods, and semi-supervised deep learning methods have all found widespread use as the field of semi-supervised semantic segmentation has advanced. These methods have developed over time, opening novel avenues of research in image segmentation and providing potent tools for tackling difficult practical issues. Ranging from classical traditional approaches to supervised methods based on deep learning, and on to semi-supervised methods that leverage both labeled and unlabeled data, these developments have deepened our understanding of image segmentation and offered flexible, efficient solutions to practical challenges. Focusing on their specialized applications in medical and remote sensing image processing, this paper presents a complete overview of the development status of these methods. The image segmentation solutions surveyed here can help tackle real-world problems where annotated data is rare or expensive.
According to the Smithsonian Institution, the art of making music has existed for over 35,000 years. As musical technology has improved, the music of the time has also improved and adapted to the new technology. In the recent expansion of technology from generative AI, text and image generation have become not only possible but also competitive with human-created text and images. As such, the development of AI-generated music is increasingly sparking considerable interest among musicians and developers alike, raising questions about the potential of AI to enhance or even replace human musical creativity. This paper will first explore the advancements of AI-generated music. Next, it will delve into the technologies and methodologies involved in generating music, as well as its current limitations using a basic LSTM (Long Short-Term Memory) model. Finally, it will explore the implications of this music for the whole music industry. By examining these various facets of AI-generated music, this research provides insights into AI's potential role in shaping the future of music. According to the analysis, a rudimentary AI model trained on complex music can produce music that is fairly elementary. Overall, these results shed light on guiding further exploration of the interaction between artificial intelligence and music.
As VR technology advances, VR games have gradually become a research hotspot. This paper investigates a VR game called “Ghost Train,” which was developed around missing-persons events. In the game, players can play as ghosts, open related feature platforms, and take on different roles to experience different lives. By doing so, they come to understand both the disadvantaged groups that need help and the people who can help them. The purpose of the game is to encourage people to calmly help strangers they encounter in trouble, rather than leaving without a trace. Extensive testing showed that the game design performs well in terms of interaction and immersion and can also support the physical and mental health of those affected by missing-persons events. Overall, this paper’s research provides valuable insights for developing VR games that can have a positive impact on real-world situations.
Facial expression recognition plays a critical role in numerous applications such as emotion analysis, human-computer interaction, and surveillance systems. Given the importance of this task, this study investigates the effectiveness of Residual Networks (ResNet) of different depths. The primary objective is to scrutinize and compare these ResNet models in terms of their training and validation losses and performance metrics such as accuracy, recall, and F1 score. A thorough comparative analysis is conducted through exhaustive experiments on a popular facial expression dataset. Despite the depth differences, ResNet101 emerged as the model with superior performance: it struck the most effective balance between model complexity and generalization capacity, leading to the lowest validation loss and the best metrics. Experimental results show that a more complex model does not necessarily yield optimal results; the balance between model complexity and generalization needs to be investigated. These findings can provide essential guidance in the design of deep learning models for facial expression recognition and similar tasks.
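The reported metrics (accuracy, recall, F1) can all be derived from a confusion matrix, as in this sketch with an invented three-class expression result (the counts are hypothetical, not the study's):

```python
import numpy as np

def metrics_from_confusion(cm):
    """Overall accuracy plus per-class recall and F1 from a confusion matrix
    whose rows are true classes and columns are predicted classes."""
    tp = np.diag(cm).astype(float)
    recall = tp / cm.sum(axis=1)        # of each true class, fraction caught
    precision = tp / cm.sum(axis=0)     # of each prediction, fraction correct
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = tp.sum() / cm.sum()
    return accuracy, recall, f1

# Hypothetical 3-class result (e.g. happy / sad / neutral).
cm = np.array([
    [50, 5, 5],
    [4, 40, 6],
    [6, 5, 39],
])
acc, rec, f1 = metrics_from_confusion(cm)
print(round(acc, 3), np.round(rec, 3))
```

Comparing per-class recall and F1 alongside overall accuracy is what exposes the complexity-versus-generalization trade-off the study examines, since a deeper model can raise accuracy while degrading minority-class recall.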
In order to identify effective metrics that can accurately reproduce the probability distributions resulting from human classifications, this paper analyzes an improved approach to galaxy morphology classification. At present, this field still faces insufficient quality and quantity of image data, low computer recognition accuracy, and weak model generalization. Previous research shows that convolutional neural networks (CNN) can perform this task but usually incur high time and space complexity. To increase efficiency, this paper improves EfficientNetV2-S to construct recognition models and characterizes their performance on galaxy recognition. The procedure includes data preparation and augmentation, model construction, the addition of an attention mechanism, fine-tuning, and result visualization. A fused mobile inverted bottleneck convolution (Fused-MBConv) structure was used to accelerate the model's convergence, and the convolutional block attention module (CBAM) was used to improve performance and feature representation capability. The model in this study minimizes complexity in its parameter count and uses less memory while maintaining excellent accuracy. The research is conducted on the Galaxy10 DECals dataset. Experimental results show that the model achieves 87% precision with 20.6M parameters, which is more efficient than the models used in previous research.
In astronomy, automated galaxy classification based on deep learning has significantly reduced the cost of manual annotation. However, the degradation problem in convolutional neural networks limits the accuracy of deep models on galaxy classification tasks. To address the loss of accuracy as models grow deeper, this paper constructs a galaxy classification model from residual block structures. Specifically, it uses an improved ResNet as the backbone, stacking different numbers of residual blocks to extract input features. Meanwhile, smaller and deeper fully connected layers, regularization and activation functions, and dropout layers are used to improve model performance. For the best-performing ResNet152 model, the paper analyzes the classification report and confusion matrix and visualizes saliency maps and Grad-CAM heatmaps. The experimental results show that introducing residual blocks increases the model's accuracy by over 30%, and models with more residual blocks perform better, although the marginal effect of additional residual blocks on accuracy is small. The visualization results show that the model accurately segments the feature focus areas and points of interest in the original image, and that by stacking multi-level residual blocks it attends more to the dense central regions of galaxies.
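The residual connection at the heart of these blocks can be sketched in NumPy (linear maps stand in for the convolutions; toy sizes, not the paper's architecture):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, W1, W2):
    """Basic residual block: out = relu(F(x) + x), where F is two linear
    transforms with a ReLU in between (stand-ins for convolutions). The
    identity shortcut lets the signal and gradients bypass F, which is what
    counters the degradation problem in very deep networks."""
    return relu(W2 @ relu(W1 @ x) + x)

rng = np.random.default_rng(7)
D = 16
x = rng.normal(size=(D,))
W1 = rng.normal(size=(D, D)) * 0.1
W2 = rng.normal(size=(D, D)) * 0.1

y = residual_block(x, W1, W2)
# With F zeroed out the block reduces to relu(x): the skip path alone carries
# the signal, so stacking many blocks cannot make the mapping harder to learn
# than the identity.
y_identity = residual_block(x, np.zeros((D, D)), np.zeros((D, D)))
print(np.allclose(y_identity, relu(x)))  # → True
```

This identity-fallback property is why the paper can stack ResNet152-scale depths and still see accuracy improve rather than degrade.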
With the rapid development of computer vision and convolutional neural networks, automatic facial emotion classification has become a reality. The aim of this study is to improve a basic neural network model to achieve effective facial emotion classification. The author enhances the basic architecture by presenting a streamlined network for the recognition model. In particular, the model augments the basic network with a convolutional layer, a max pooling layer, and a dropout layer, and increases the number of neurons in the dense layer from 25 to 128. The convolutional layer allows for the automatic extraction of emotion features, and the max pooling layer reduces the parameters in the feature maps. The experiments are conducted on the Facial Emotion Recognition 2013 dataset (FER-2013). The streamlined network model improves accuracy by 6%, to 56.32%, compared with the basic neural network model. Numerous experiments show that the proposed streamlined network can effectively recognize facial emotions. In addition, the author analyzes the confusion matrix and finds that the model responds weakly to aversive emotions. Future research will focus on improving the representation of unclear features such as aversive emotions to enhance model generalization.
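The max pooling operation used to shrink the feature maps can be sketched as non-overlapping 2×2 pooling in NumPy (illustrative only; frameworks provide this as a layer):

```python
import numpy as np

def max_pool2d(x, k=2):
    """Non-overlapping k x k max pooling: keep the strongest activation in
    each window. It shrinks the feature map (cutting downstream parameters)
    and has no learnable parameters of its own."""
    H, W = x.shape
    x = x[:H // k * k, :W // k * k]          # drop ragged edges if any
    return x.reshape(H // k, k, W // k, k).max(axis=(1, 3))

# Toy 4x4 feature map; each 2x2 window collapses to its maximum.
fmap = np.array([
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 6, 2],
    [3, 2, 1, 1],
], dtype=float)
print(max_pool2d(fmap))  # → [[4. 5.] [3. 6.]]
```

Halving each spatial dimension quarters the activations entering the dense layer, which is how the pooling layer reduces the parameters downstream.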