Applied and Computational Engineering

- The Open Access Proceedings Series for Conferences

Volume Info.

  • Title

    Proceedings of the 5th International Conference on Computing and Data Science

    Conference Date

    2023-07-14

    Website

    https://2023.confcds.org/

    ISBN

    978-1-83558-259-6 (Print)

    978-1-83558-260-2 (Online)

    Published Date

    2023-12-26

    Editors

    Alan Wang, University of Auckland

    Marwan Omar, Illinois Institute of Technology

    Roman Bauer, University of Surrey

Articles

  • Open Access | Article 2023-12-26 Doi: 10.54254/2755-2721/29/20230708

    Application of YOLOv5 for mask detection on IoT

    Combining the Internet of Things with deep learning typically raises problems such as limited bandwidth and computing resources: constrained edge hardware often causes system freezes or delays. Upgrading the hardware of an IoT system carries a large economic cost, whereas a lightweight deep learning model can reduce hardware resource consumption to fit the actual scene. In this paper, we combine IoT technology with an improved lightweight deep learning model, YOLOv5, to assist people in mask detection, vehicle counting, and target tracking without taking up excessive computing resources. We deployed the improved YOLOv5 on the server side and completed training in a container. The trained weight file was deployed in Docker and then combined with Kubernetes to obtain the final experimental results. The resulting graph can be displayed by opening a browser at the edge node and entering the relevant IP address. Users can also operate on the results in the browser front end, for example drawing a horizontal line across the road to perform a local vehicle count; these operations are fed back to the server for interaction with developers. The improved YOLOv5 recognizes faster and more accurately than before, and compared with the previous version the model requires less storage space and is easier to deploy, making it better suited to running on edge nodes. Theoretical analysis and experimental results verify the feasibility and superiority of the proposed method.
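
    As a rough, hedged illustration of the inference step described above, the sketch below loads a custom-trained YOLOv5 weight file through the ultralytics/yolov5 torch.hub entry point; the weight and image file names are hypothetical, not the authors' released artifacts.

    ```python
    # Hypothetical sketch: single-frame mask detection with a custom-trained
    # YOLOv5 weight file; "mask_yolov5s.pt" and the image name are placeholders.
    import torch

    model = torch.hub.load("ultralytics/yolov5", "custom", path="mask_yolov5s.pt")
    results = model("street_camera_frame.jpg")  # run inference on one frame
    results.print()                             # per-class counts and confidences
    results.save()                              # annotated image saved under runs/detect
    ```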

  • Open Access | Article 2023-12-26 Doi: 10.54254/2755-2721/29/20230744

    Project on salary classification

    In this project, three different machine learning algorithms are used to approach salary classification. The analyzed data uses many different variables, such as education level, age, and work-class, to label each person into one of two categories: salary greater than 50k, or salary less than or equal to 50k. First, this work uses a single decision tree model to visualize the data because it is concise and understandable; then, using the support vector machine method, the result becomes more accurate. After building these two models, the accuracy was found to be about 86.32%, which is relatively high and reliable. However, still higher accuracy would be more persuasive, so this project applies a third model, the random forest, an algorithm considered highly accurate because of the number of decision trees participating in it. This model achieved an accuracy of 87.03%. According to these models, a person who desires a wage increase should do their best to improve their education level, maintain a stable marriage, and, where possible, start their own business between the ages of 20 and 60.
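
    As a hedged sketch of the third model, the snippet below trains a random forest on census-style attributes; the UCI Adult data layout and the file name adult.csv are assumptions.

    ```python
    # Minimal sketch of the random-forest setup, assuming UCI "Adult"-style
    # census data; "adult.csv" and the "income" column are placeholders.
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    df = pd.read_csv("adult.csv")
    X = pd.get_dummies(df.drop(columns=["income"]))  # one-hot encode categoricals
    y = (df["income"] == ">50K").astype(int)         # binary target: >50k or not

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
    clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
    print(accuracy_score(y_te, clf.predict(X_te)))
    ```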

  • Open Access | Article 2023-12-26 Doi: 10.54254/2755-2721/29/20230765

    Brain tumor MRI images classification based on machine learning

    Recent research has shown machine learning’s outstanding performance on image classification tasks, including applications to Magnetic Resonance Images. While prior models are overly complicated, this paper proposes a simplified model that proves both accurate and far less time-consuming. Our proposed method builds on prior research and combines Bias Field Correction, DenseNet, and SE-Net into a concise structure. With small datasets of T1-weighted and T2-weighted labeled MR brain tumor images, our model required a training time of only 2 hours and showed excellent performance in classifying pituitary tumor, meningioma, glioma, or no tumor, with an accuracy of 91.32%. After evaluation, our model proves accurate in distinguishing between 3 of the tumor types with an F1-score of 0.96.
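
    A minimal sketch of how DenseNet features and a squeeze-and-excitation (SE) block might be combined, assuming a DenseNet-121 backbone and illustrative layer sizes; the paper's exact architecture may differ.

    ```python
    # Hedged sketch: DenseNet backbone + SE channel re-weighting + linear head.
    import torch
    import torch.nn as nn
    from torchvision.models import densenet121

    class SEBlock(nn.Module):
        def __init__(self, channels, reduction=16):
            super().__init__()
            self.fc = nn.Sequential(
                nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
                nn.Linear(channels // reduction, channels), nn.Sigmoid(),
            )

        def forward(self, x):                         # x: (N, C, H, W)
            w = self.fc(x.mean(dim=(2, 3)))           # squeeze: global average pool
            return x * w.unsqueeze(-1).unsqueeze(-1)  # excitation: re-weight channels

    class TumorClassifier(nn.Module):
        def __init__(self, num_classes=4):            # pituitary/meningioma/glioma/none
            super().__init__()
            self.features = densenet121(weights=None).features  # 1024-channel maps
            self.se = SEBlock(1024)
            self.head = nn.Linear(1024, num_classes)

        def forward(self, x):
            f = self.se(self.features(x))
            return self.head(f.mean(dim=(2, 3)))
    ```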

  • Open Access | Article 2023-12-26 Doi: 10.54254/2755-2721/29/20230778

    A study of the transaction volume prediction problem based on recurrent neural networks

    With the rapid development of artificial intelligence technology, intelligent fintech scenarios based on big data are receiving increasing attention; analysis of massive financial data can provide accurate decision support for these scenarios. By predicting the transaction volume of a bank’s financial product, abnormal transaction flows and gradual trends can be found one day in advance, providing decision support for business departments’ program development and for system scaling up or down, thereby reducing online system pressure or releasing unnecessary system resources. Linear algorithms such as the AR, MA, and ARMA models make strong assumptions about historical data and therefore predict holiday transaction volumes poorly on the non-stationary dataset handled in this study. In this paper, we design and implement an LSTM-based trading volume prediction model, LSTM-WP (LSTM-WebPredict), using a deep learning algorithm. By discovering and learning features of the historical data, it improves the accuracy of holiday trading volume prediction by about 8% over the linear algorithms, and the model’s learning ability grows as training data increases. This work also provides technical groundwork for other time series business scenarios, such as trend prediction and capacity assessment.
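
    As an illustration of the kind of model the name LSTM-WP refers to, here is a minimal next-day volume predictor; the 30-day window, hidden size, and single-feature input are assumptions.

    ```python
    # Hedged sketch: sliding windows of historical volume -> next day's volume.
    import torch
    import torch.nn as nn

    class VolumeLSTM(nn.Module):
        def __init__(self, n_features=1, hidden=64):
            super().__init__()
            self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
            self.head = nn.Linear(hidden, 1)

        def forward(self, x):              # x: (batch, window, n_features)
            out, _ = self.lstm(x)
            return self.head(out[:, -1])   # predict from the last hidden state

    window = torch.randn(16, 30, 1)        # dummy batch of 30-day windows
    print(VolumeLSTM()(window).shape)      # torch.Size([16, 1])
    ```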

  • Open Access | Article 2023-12-26 Doi: 10.54254/2755-2721/29/20230785

    TrajTransGCN: Enhancing trajectory prediction by fusing transformer and graph neural networks

    This paper proposes a novel model named TrajTransGCN for taxi trajectory prediction, which leverages the power of both graph convolutional networks (GCNs) and the Transformer. TrajTransGCN first passes the input through the GCN layer and then combines the GCN outputs with one-hot encoded categorical features as input to the Transformer layer. This paper evaluates TrajTransGCN on a real-world taxi trajectory dataset from Porto and compares it against several baselines. The experimental results show that TrajTransGCN outperforms all the other models in terms of both RMSE and MAPE. Specifically, the model achieves an RMSE of 0.0247 and a MAPE of 0.09%, significantly lower than those of the other models. The results demonstrate the effectiveness of the proposed model in predicting taxi trajectories, indicating the potential of combining GCN and Transformer layers in trajectory prediction tasks. In addition, this paper includes ablation experiments demonstrating the effectiveness of one-hot encodings of classification labels in complex real-time scenarios, and a parameter study examining how TrajTransGCN's performance is affected by the learning rate, the number of Transformer layers, and the size of the Transformer layer's hidden dimension.
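
    A hedged sketch of the GCN-to-Transformer fusion described above, using torch_geometric's GCNConv; all dimensions, the one-hot width, and the coordinate output head are illustrative assumptions.

    ```python
    # Hedged sketch: GCN features concatenated with one-hot categoricals,
    # then passed through a Transformer encoder.
    import torch
    import torch.nn as nn
    from torch_geometric.nn import GCNConv

    class TrajTransGCNSketch(nn.Module):
        def __init__(self, in_dim, gcn_dim=64, onehot_dim=8, nhead=8):
            super().__init__()
            d_model = gcn_dim + onehot_dim             # fused feature width
            self.gcn = GCNConv(in_dim, gcn_dim)
            layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
            self.transformer = nn.TransformerEncoder(layer, num_layers=2)
            self.out = nn.Linear(d_model, 2)           # e.g. next (lat, lon)

        def forward(self, x, edge_index, onehot):      # x: node features
            h = torch.relu(self.gcn(x, edge_index))    # spatial message passing
            h = torch.cat([h, onehot], dim=-1)         # fuse categorical one-hots
            h = self.transformer(h.unsqueeze(0))       # nodes as one sequence
            return self.out(h.squeeze(0))
    ```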

  • Open Access | Article 2023-12-26 Doi: 10.54254/2755-2721/29/20230803

    ML-based SDN performance prediction

    Software-defined networking (SDN), a new type of network architecture with the advantages of programmability and centralized management, has become a promising solution for managing and optimizing network traffic in modern data centers. However, designing efficient SDN controllers and applications requires a deep understanding of their network performance characteristics. In this work, we implement a machine learning-based method for SDN performance prediction. Our method uses supervised learning to train a model on a set of publicly available real network traffic datasets and then uses the model to predict future network performance metrics, such as RTT, S2C, and C2C. Our method is evaluated in two different SDN distributed deployment structures, demonstrating its effectiveness in network performance prediction. We observed that XGBoost achieves the lowest error in most cases in terms of MAE, RMSE, and MAPE, and that feature selection through PCA fails to further improve XGBoost's prediction performance.
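
    A brief sketch of the XGBoost regression leg, assuming a flow-feature table with an RTT column; the file and column names are placeholders.

    ```python
    # Hedged sketch: predict RTT from traffic features with XGBoost.
    import pandas as pd
    import xgboost as xgb
    from sklearn.metrics import mean_absolute_error
    from sklearn.model_selection import train_test_split

    df = pd.read_csv("sdn_metrics.csv")              # hypothetical feature export
    X, y = df.drop(columns=["rtt"]), df["rtt"]
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

    model = xgb.XGBRegressor(n_estimators=300, max_depth=6, learning_rate=0.1)
    model.fit(X_tr, y_tr)
    print("MAE:", mean_absolute_error(y_te, model.predict(X_te)))
    ```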

  • Open Access | Article 2023-12-26 Doi: 10.54254/2755-2721/29/20230874

    Comparison between transformer, informer, autoformer and non-stationary transformer in financial market

    This paper delves into the significance of predicting stock prices and carries out comparative experiments using a variety of models, including Support Vector Regression, the Long Short-Term Memory model, Transformer, Informer, Autoformer, and the Non-Stationary Transformer. These models are used to train on and forecast the China Securities Index, Hang Seng Index, and S&P 500 Index. The experimental results are measured using indicators such as Mean Absolute Error and Root Mean Square Error. The findings show that the Non-Stationary Transformer model has the highest prediction accuracy. Additionally, a simple trading strategy is designed for each model and their Sharpe and Calmar ratios are compared. Since Autoformer has the highest Sharpe and Calmar ratios, it can be concluded that Autoformer is the most practical in the financial market among the four Transformer-based models. This research contributes to the field of stock price prediction by providing an empirical study of the Transformer and its derivative models, which have been less explored in this domain. In conclusion, this paper offers valuable insights and recommendations for data scientists and financial engineers and introduces new methods for predicting stock prices.
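
    For reference, the two ratios used to compare the trading strategies can be computed as below; the 252-trading-day annualization is a common convention, not necessarily the paper's exact definition.

    ```python
    # Illustrative Sharpe and Calmar ratio computations on daily returns.
    import numpy as np

    def sharpe(daily_returns, risk_free=0.0):
        excess = daily_returns - risk_free / 252
        return np.sqrt(252) * excess.mean() / excess.std()

    def calmar(daily_returns):
        equity = np.cumprod(1 + daily_returns)            # cumulative equity curve
        drawdown = 1 - equity / np.maximum.accumulate(equity)
        annual_return = equity[-1] ** (252 / len(daily_returns)) - 1
        return annual_return / drawdown.max()

    rets = np.random.normal(5e-4, 0.01, 252)              # dummy strategy returns
    print(sharpe(rets), calmar(rets))
    ```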

  • Open Access | Article 2023-12-26 Doi: 10.54254/2755-2721/29/20230894

    Machine learning for sustainable investing: Current applications and overcoming obstacles in ESG analysis

    The intersection of Environmental, Social, and Governance (ESG) issues and Machine Learning (ML) has garnered significant attention in recent years as companies and investors increasingly recognize the paramount importance of sustainable and responsible business practices. ML techniques have been actively explored to tackle various ESG-related challenges, including enhancing ESG data quality and availability, developing comprehensive and dynamic ESG risk models, and optimizing ESG portfolios. The overall process of applying ML models in ESG analysis involves data collection, preprocessing, model training and evaluation, and model interpretation. Commonly used ML models in ESG analysis include logistic regression, decision trees, random forests, and support vector machines. However, there are notable obstacles to overcome, such as the lack of standardization and transparency in ESG data, as well as the potential for bias and ethical concerns in ML-based approaches. Further research and collaborative efforts among researchers and practitioners are crucial to fully realize the potential of ML in enhancing ESG analysis while ensuring transparency, ethical use, and alignment with sustainable and responsible investing principles.

  • Open Access | Article 2023-12-26 Doi: 10.54254/2755-2721/29/20230903

    Can natural language processing accurately predict stock market movements based on Reddit World News headlines?

    This research examines the application of machine learning and natural language processing (NLP) methods to stock market movement forecasting. Several NLP approaches were used to gather and preprocess Dow Jones Industrial Average (DJIA) data and Reddit World News headlines. The preprocessed data were then used to train three machine learning algorithms (Random Forest, Logistic Regression, and Naive Bayes) to forecast the daily trend of the DJIA. According to the study, the Naive Bayes algorithm, paired with TextBlob, fared better than the other two models, obtaining an accuracy of 68.59%, an improvement over previous research. These findings show how NLP and machine learning may be used to forecast stock market patterns and offer directions for further study to boost the precision of these models.
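
    A hedged sketch of the sentiment-feature step, with TextBlob scores feeding a Naive Bayes classifier; the merged CSV, its column names, and the two-feature design are assumptions.

    ```python
    # Hedged sketch: TextBlob polarity/subjectivity per day -> Naive Bayes.
    import pandas as pd
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import GaussianNB
    from textblob import TextBlob

    df = pd.read_csv("djia_headlines.csv")       # hypothetical merged dataset
    df["polarity"] = df["headlines"].apply(lambda t: TextBlob(t).sentiment.polarity)
    df["subjectivity"] = df["headlines"].apply(lambda t: TextBlob(t).sentiment.subjectivity)

    X, y = df[["polarity", "subjectivity"]], df["djia_up"]   # 1 = index closed up
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, shuffle=False)
    print(GaussianNB().fit(X_tr, y_tr).score(X_te, y_te))
    ```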

  • Open Access | Article 2023-12-26 Doi: 10.54254/2755-2721/29/20230949

    DTI fiber tractography of human brain

    DTI fiber tractography is a powerful tool for investigating the human brain's structural connectivity. It enables us to explore the complex network of fiber pathways that connect different regions of the brain and play a crucial role in its function. In this work, I used DWI (Diffusion-Weighted Imaging) data processing software to construct fiber tracks of the human brain from MRI (Magnetic Resonance Imaging) data and investigated the brain's anatomical structure in a human subject using DTI (Diffusion Tensor Imaging) fiber tractography. The two software packages I used were Diffusion Toolkit and Trackvis. Diffusion Toolkit did the preparation work for Trackvis, including data reconstruction and fiber tracking on diffusion-weighted MR (Magnetic Resonance) images; Trackvis was used for tractogram visualization and further analysis of the white matter tracts generated by DTI fiber tractography. In this work, I successfully used Diffusion Toolkit and Trackvis to construct fiber tracks of the human brain, and the results were correct when compared to the standard brain. In addition, I summarize the principle of DTI and the advantages and disadvantages of the technology.
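
    For readers who prefer a scriptable route, the tensor-fitting step that Diffusion Toolkit performs can be approximated with the dipy library; this is a swap-in alternative, not the pipeline used in this work, and the file names are placeholders.

    ```python
    # Hedged dipy sketch (an alternative to Diffusion Toolkit, not the
    # author's pipeline); dwi.nii.gz / dwi.bval / dwi.bvec are placeholders.
    from dipy.core.gradients import gradient_table
    from dipy.io.gradients import read_bvals_bvecs
    from dipy.io.image import load_nifti
    from dipy.reconst.dti import TensorModel

    data, affine = load_nifti("dwi.nii.gz")
    bvals, bvecs = read_bvals_bvecs("dwi.bval", "dwi.bvec")
    fit = TensorModel(gradient_table(bvals, bvecs)).fit(data)
    print(fit.fa.shape)   # fractional anisotropy volume, used to seed tracking
    ```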

  • Open Access | Article 2023-12-26 Doi: 10.54254/2755-2721/29/20230972

    Machine learning-based DDoS detection for IoT networks

    DDoS attacks are one of the most dangerous threats to IoT networks, and they involve using attacker-controlled botnets to flood the network with malicious traffic that denies legitimate services. The global DDoS landscape is rapidly evolving, and it has become increasingly important for devices to quickly identify the types of DDoS attacks they face so that they can choose and implement effective countermeasures against known attacks. Machine learning has emerged as a popular approach for detecting DDoS traffic in IoT networks. This paper implements four machine learning models, namely Support Vector Machine (SVM), Decision Tree, Long Short-Term Memory (LSTM), and Random Forest, to perform multiclass classification for DDoS attack detection. The study uses the CICDDoS2019 dataset for evaluation. The results show that all four models can detect most types of DDoS traffic effectively. The Random Forest model achieves the highest overall accuracy of 99.32%, followed by the Decision Tree model with an accuracy of 99.10%. The LSTM and SVM models have slightly lower accuracies at 98.20% and 93.00%, respectively. The study also evaluates the models' performance in terms of precision, recall, and F1 score. Decision Tree outperforms the other models in precision, while Random Forest has the highest recall score. Moreover, the Random Forest model performs the best in terms of the F1 score. In conclusion, this paper demonstrates the effectiveness of machine learning-based approaches for DDoS detection in IoT networks using four popular models. The results illustrate the potential for these models to provide reliable and accurate detection of DDoS traffic, thus enabling effective countermeasures to be taken against this type of attack.
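
    A minimal sketch of one of the four detectors, a multiclass random forest over flow features, assuming a CSV export of CICDDoS2019 with a Label column naming the attack type.

    ```python
    # Hedged sketch: multiclass DDoS classification with a random forest.
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import classification_report
    from sklearn.model_selection import train_test_split

    df = pd.read_csv("cicddos2019_flows.csv")        # hypothetical flow export
    X, y = df.drop(columns=["Label"]), df["Label"]   # one class per DDoS variant
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, test_size=0.2)

    clf = RandomForestClassifier(n_estimators=100).fit(X_tr, y_tr)
    print(classification_report(y_te, clf.predict(X_te)))  # per-class P/R/F1
    ```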

  • Open Access | Article 2023-12-26 Doi: 10.54254/2755-2721/29/20230997

    Signal processing and machine learning in healthcare

    Machine learning is an important branch of artificial intelligence. With continuous technical innovation, its application in the medical field has become increasingly extensive and deep. In view of the instability of human visual discrimination and of limited clinical experience, this paper proposes applying machine learning: by training on the various attributes of breast cancer data, a breast cancer diagnosis system can automatically identify malignant cases, reducing the dependence of the diagnosis on the operator's time and experience.
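
    As an illustrative stand-in (the abstract does not specify the exact dataset or classifier), scikit-learn's built-in Wisconsin breast cancer attributes support a similar automatic diagnosis pipeline:

    ```python
    # Hedged sketch: SVM diagnosis on 30 tumor attributes (benign vs. malignant).
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    X, y = load_breast_cancer(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    clf = make_pipeline(StandardScaler(), SVC()).fit(X_tr, y_tr)
    print(clf.score(X_te, y_te))                 # held-out diagnostic accuracy
    ```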

  • Open Access | Article 2023-12-26 Doi: 10.54254/2755-2721/29/20231181

    Machine learning techniques for predicting home rental prices in India

    Predicting the selling price of houses has become increasingly crucial as land and housing prices rise annually. This task is particularly challenging for metropolitan areas like Chennai and Bangalore. Therefore, there is a growing demand for an easier and more effective approach to forecasting house rental prices, ensuring future generations have access to reliable predictions. Several key factors, such as a house's location and area, significantly influence rental prices. In this paper, a dataset comprising ten such crucial features is utilized. The models are developed using Python libraries, and the data is preprocessed and cleaned before model construction. Various machine learning algorithms, including Random Forest, Linear Regression, Decision Tree Regression, and Gradient Boosting, are employed. Through feature extraction, it is determined that area and property type are the features that most significantly impact rental prices. Among the techniques used, gradient boosting yields the most satisfactory predictive results for rent based on evaluation metrics such as Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and the R-Squared metric (R2).
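
    A brief sketch of the best-performing gradient-boosting setup, assuming a cleaned rental table with a rent target; the file and column names are placeholders.

    ```python
    # Hedged sketch: gradient boosting on encoded rental features, scored
    # with RMSE and R2 as in the paper's evaluation.
    import pandas as pd
    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.metrics import mean_squared_error, r2_score
    from sklearn.model_selection import train_test_split

    df = pd.read_csv("house_rent.csv")
    X = pd.get_dummies(df.drop(columns=["rent"]))    # encode location, type, etc.
    y = df["rent"]
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

    gbr = GradientBoostingRegressor().fit(X_tr, y_tr)
    pred = gbr.predict(X_te)
    print("RMSE:", mean_squared_error(y_te, pred) ** 0.5, "R2:", r2_score(y_te, pred))
    ```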

  • Open Access | Article 2023-12-26 Doi: 10.54254/2755-2721/29/20231207

    Research on momentum strategy and contrarian strategy in AI stock prediction

    The emergence of ChatGPT has significantly enhanced the recognition and acceptance of artificial intelligence concept stocks within the Chinese stock market. Nevertheless, the short- and long-term fluctuations in the prices of AI companies remain uncertain. Therefore, the purpose of this research is to determine whether the contrarian strategy or the momentum strategy is better suited to predicting the stock prices of AI concept stocks in the Chinese stock market. Based on a cross-comparison of the Chinese financial data sources iFinD and the Wind Economic Database (EDB), this study collects the price data of AI concept stocks over the six months following ChatGPT's release. This study employs Python to model stock price movements under both the momentum and contrarian strategies. The goodness of fit is evaluated by comparing the modeled stock prices with the actual stock prices. This study demonstrates that the momentum strategy exhibits greater explanatory power than the contrarian strategy, accurately predicting 84.21% of artificial intelligence concept stocks. However, other studies suggest that momentum strategies remain effective while AI concept stocks continue to rise, whereas when market sentiment cools down, contrarian strategies become more suitable for Chinese AI concept stocks. Hence, in China, the effectiveness of these strategies may vary depending on the prevailing market conditions.
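
    As a toy illustration of the two signals being compared (the study's actual lookback window is not stated here; 20 days is an assumption):

    ```python
    # Hedged sketch: momentum buys recent winners; contrarian takes the
    # opposite side of the same signal.
    import pandas as pd

    def momentum_signal(prices: pd.Series, lookback: int = 20) -> pd.Series:
        past_return = prices.pct_change(lookback)
        return (past_return > 0).astype(int)          # 1 = hold, 0 = stay flat

    def contrarian_signal(prices: pd.Series, lookback: int = 20) -> pd.Series:
        return 1 - momentum_signal(prices, lookback)  # bet on reversal instead
    ```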

  • Open Access | Article 2023-12-26 Doi: 10.54254/2755-2721/29/20230726

    Predicting cryptocurrency investment suitability using machine learning techniques

    The study aims to predict the close prices of four different cryptocurrencies (Bitcoin, Ethereum, Dogecoin, and Cardano) using machine learning techniques and determine which of these cryptocurrencies is suitable for investment. To achieve this goal, we used two popular gradient boosting algorithms: Extreme Gradient Boosting (XGBoost) and Light Gradient-Boosting Machine (LightGBM). Prediction accuracy of the trained model is evaluated by Mean Absolute Error (MAE) generated by the methodology of Cross-Validation. Our results show that both XGBoost and LightGBM can effectively predict the close prices of the four cryptocurrencies, with LightGBM achieving slightly better performance in terms of prediction accuracy. Based on our analysis, we were able to identify which cryptocurrencies were suitable for investing and provide recommendations for potential investors. Overall, our study highlights the potential of machine learning techniques in predicting cryptocurrency close prices and identifying suitable investment opportunities.
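
    A hedged sketch of the LightGBM leg with cross-validated MAE, assuming an OHLCV table per coin; the file and feature names are placeholders.

    ```python
    # Hedged sketch: cross-validated MAE for a LightGBM close-price model.
    import lightgbm as lgb
    import pandas as pd
    from sklearn.model_selection import cross_val_score

    df = pd.read_csv("btc.csv")                      # hypothetical OHLCV table
    X = df[["open", "high", "low", "volume"]]        # assumed predictors
    y = df["close"]

    model = lgb.LGBMRegressor(n_estimators=200)
    mae = -cross_val_score(model, X, y, cv=5, scoring="neg_mean_absolute_error").mean()
    print("CV MAE:", mae)
    ```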

  • Open Access | Article 2023-12-26 Doi: 10.54254/2755-2721/29/20230742

    Prediction of movies popularity in supervised learning techniques

    As the movie industry has gradually become capital-intensive, predicting movies’ popularity and commercial potential from historical data has become a popular research topic in machine-learning-based data analysis. In this paper, researchers trained three supervised machine learning models (Random Forest, Naive Bayes, and Support Vector Regression) on an IMDb dataset to predict a movie’s popularity. This research has two outcomes: (1) the Random Forest model has the highest accuracy rate; (2) the number of Oscar winners among the cast and crew is the feature most positively related to a movie’s popularity.
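
    A compact sketch of the random forest variant and its feature-importance readout, assuming a hypothetical IMDb feature table with an is_popular label.

    ```python
    # Hedged sketch: popularity classification plus feature importances.
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier

    df = pd.read_csv("imdb_movies.csv")              # hypothetical feature table
    X = df[["budget", "runtime", "cast_oscar_winners", "crew_oscar_winners"]]
    clf = RandomForestClassifier().fit(X, df["is_popular"])
    print(dict(zip(X.columns, clf.feature_importances_)))
    ```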

  • Open Access | Article 2023-12-26 Doi: 10.54254/2755-2721/29/20230743

    Bitcoin price prediction based on sentiment analysis and LSTM

    As cryptocurrencies become widely accepted thanks to technical improvements, reliable approaches to capturing their future price movements become critical. This study combines weighted sentiment analysis results from social media comments and financial news headlines with a stacked LSTM model to predict the next day's Bitcoin price movement. It also compares these results with those produced by MLP, RF, and SVM models fed the same sentiment analysis results.
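
    A sketch of the weighted-sentiment step; TextBlob and engagement-based weights are assumptions, not necessarily the tools this study used.

    ```python
    # Hedged sketch: engagement-weighted daily sentiment from comments and
    # headlines, to be fed alongside price into a stacked LSTM.
    import pandas as pd
    from textblob import TextBlob

    def daily_sentiment(texts: pd.Series, weights: pd.Series) -> float:
        scores = texts.apply(lambda t: TextBlob(t).sentiment.polarity)
        return float((scores * weights).sum() / weights.sum())

    day = pd.DataFrame({"text": ["BTC to the moon", "Bitcoin crashes again"],
                        "upvotes": [120, 30]})
    print(daily_sentiment(day["text"], day["upvotes"]))
    ```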

  • Open Access | Article 2024-01-22 Doi: 10.54254/2755-2721/29/20231149

    Query-Based Dialogue Summarization Using BART

    Conversation summarisation is the transformation of long conversational texts into concise and accurate summaries; its importance lies in improving the user experience and information filtering. As an important natural language processing task, conversation summarisation can provide concise and accurate information while avoiding repetition and redundancy. In the dialogue summarisation task, pre-trained language models can be used to condense long conversations into concise and accurate summaries. The aim of this paper is to investigate the possibility of using the Bidirectional and Auto-Regressive Transformer (BART) model for dialogue summarisation tasks. In our experiments, we analysed the characteristics of the Query-based Multi-domain Meeting Summarization (QMSum) dialogue summarisation dataset, proposed a dialogue summarisation model based on BART, and designed evaluation experiments to compare its performance with other methods on the dialogue summarisation task. The experimental results are significant for advancing dialogue summarisation tasks and the application of the BART model.
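
    A minimal sketch of query-conditioned generation with a pre-trained BART checkpoint; prepending the query to the transcript is one simple conditioning scheme, and the checkpoint name is an assumption, not necessarily the paper's setup.

    ```python
    # Hedged sketch: query + transcript in, summary out, with beam search.
    from transformers import BartForConditionalGeneration, BartTokenizer

    tok = BartTokenizer.from_pretrained("facebook/bart-large-cnn")
    model = BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")

    query = "What did the group decide about the remote control design?"
    meeting = "..."                                  # long QMSum-style transcript
    inputs = tok(query + " </s> " + meeting, return_tensors="pt",
                 truncation=True, max_length=1024)
    ids = model.generate(**inputs, max_length=128, num_beams=4)
    print(tok.decode(ids[0], skip_special_tokens=True))
    ```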

Copyright © 2023 EWA Publishing. Unless Otherwise Stated