Proceedings of the 5th International Conference on Computing and Data Science
Marwan Omar, Illinois Institute of Technology
Roman Bauer, University of Surrey
Alan Wang, University of Auckland
The volatility of Australian companies' stock prices in 2020, caused by China's trade restrictions, poses a significant challenge for predicting financial gain or loss. This research contributes to future scholarship in predicting stock prices under specific circumstances or during special time periods. The study proposes a novel approach to stock price prediction, incorporating news sentiment analysis into a deep learning model. The research collected news items potentially affecting the stock price, incorporating them into an analysis model to generate a new feature for the Long Short-Term Memory (LSTM) model. The LSTM model used in this study was bidirectional, with two sets of gates per layer, and a three-layer model with different units. Each layer employed a dropout layer and a dense layer in the final stage. The study also utilized the feature engineering of lookback, selecting a window of time in the past to predict the next day's stock prices. Following multiple hyperparameter tunings and feature engineering adjustments, the results and graphs demonstrate a successful prediction for all three of the chosen companies, even during an unstable stock market. The overall trend lines achieve optimal predictions for the stock prices, illustrating both upward and downward trends.
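The lookback feature described in this abstract can be illustrated with a short sketch. The abstract does not state the window length or the data layout, so the function name `make_lookback_windows`, the plain list of prices, and the window length in the usage note are assumptions for illustration only, not the authors' pipeline.

```python
from typing import List, Sequence, Tuple


def make_lookback_windows(
    prices: Sequence[float], lookback: int
) -> Tuple[List[List[float]], List[float]]:
    """Pair each run of `lookback` past prices with the next day's price.

    Hypothetical helper: the name and signature are illustrative assumptions.
    """
    if lookback < 1:
        raise ValueError("lookback must be at least 1")
    if len(prices) <= lookback:
        raise ValueError("need more prices than the lookback length")

    inputs: List[List[float]] = []
    targets: List[float] = []
    for end in range(lookback, len(prices)):
        # The window covers the `lookback` days immediately before day `end`.
        inputs.append(list(prices[end - lookback:end]))
        # The target is the price on day `end`, i.e. the next day after the window.
        targets.append(prices[end])
    return inputs, targets
```

For example, `make_lookback_windows([1, 2, 3, 4, 5], 3)` pairs the window `[1, 2, 3]` with the target `4` and `[2, 3, 4]` with `5`. A sequence model such as an LSTM would then be trained on these pairs.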
The use of taxis as a fundamental mode of transportation in everyday life has led to the increased popularity of various ride-hailing applications such as Uber and Lyft, enabling users to conveniently request and view the predicted fare for their desired destination. Accurately predicting the fare is thus of significant importance. In this study, machine learning models were employed to forecast taxi fares based on factors such as distance and passenger count. As the initial data only contained latitude and longitude values, the Haversine formula was utilized to calculate the distance between two locations. Moreover, the raw data was plagued with inconsistencies such as negative fares and grossly exaggerated distances, which were resolved by implementing four data cleaning criteria. Following the preprocessing stage, three distinct models (i.e., linear regression, decision tree, and random forest) were trained and evaluated using the root mean square error metric. The results indicated that the random forest model produced the smallest error (1.264), followed by the decision tree model with a similar error rate (1.277), and lastly, the linear regression model with the highest error (1.718). Thus, the random forest model demonstrated superior performance and is recommended for accurate fare predictions.
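The distance step mentioned in this abstract can be sketched as follows. This is a minimal version of the Haversine formula, assuming coordinates in decimal degrees and a mean Earth radius of 6371.0088 km; the function name `haversine_km` is illustrative and not taken from the paper.

```python
import math

# Assumption: mean Earth radius in kilometres; the paper does not state the value it used.
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two (latitude, longitude) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    # Haversine of the central angle between the two points.
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    # Clamp to 1.0 to guard against rounding error before taking asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))
```

A pickup and drop-off pair from the raw data would be passed as `haversine_km(pickup_lat, pickup_lon, dropoff_lat, dropoff_lon)`, and the result used as the distance feature for the regression models.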
As electric vehicles become more popular, battery swap stations are gaining attention as a new type of charging facility. However, the charging process for electric vehicles involves privacy information such as user location and charging mode, which can be easily stolen or leaked, posing security risks and personal privacy concerns for users. Therefore, protecting the privacy of electric vehicle battery swap station users has become an important issue. This paper aims to study a privacy protection system for electric vehicle battery swap stations using blockchain technology. First, the basic principles and application scenarios of blockchain technology are introduced. Second, potential privacy leaks in electric vehicle battery swap stations are analysed, and a privacy protection scheme based on blockchain is proposed, including anonymous identity authentication, zero-knowledge proof, and encrypted communication. Third, a blockchain-based privacy protection system for electric vehicle battery swap stations is designed and implemented, and its performance is experimentally evaluated and compared with traditional privacy protection schemes in terms of security and efficiency. This paper demonstrates that the blockchain-based privacy protection scheme for electric vehicle battery swap stations possesses high levels of security and reliability, effectively safeguarding users' privacy information. Furthermore, this scheme exhibits promising application prospects and potential for widespread adoption. With the continuous development and utilization of blockchain technology, the privacy protection scheme for electric vehicle battery swap stations using blockchain is expected to provide users with more secure, reliable, and convenient charging services.
In recent years, digital signatures have developed rapidly, with new products that combine them with blockchain, distributed networks, and quantum computers, and they play a vitally important role in file authorization and verification. Combined with these new technologies, digital signatures show strong vitality, and new algorithms are widely used in scenarios including banking, financial services, and insurance (BFSI), education, e-government, healthcare, and the military. However, there is no paper that summarizes these new digital signature applications, which is the aim of this paper. The paper presents the technical details of digital signatures and blockchain, and discusses which digital signature algorithms are used in different fields to give an overview of the relationship between algorithms and scenarios. Furthermore, the paper compares the performance of the most commonly used digital signature algorithms, including the Rivest–Shamir–Adleman (RSA) algorithm, the Lamport algorithm, the Elliptic Curve Digital Signature Algorithm (ECDSA), and the Edwards-curve Digital Signature Algorithm (EdDSA).
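Of the algorithms named in this abstract, the Lamport scheme is simple enough to sketch with a hash function alone. The version below is a minimal illustration, assuming SHA-256 as the hash and 256 key pairs; the function names are hypothetical, and it is not a description of how the paper implemented or benchmarked the scheme.

```python
import hashlib
import secrets


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _message_bits(message: bytes) -> list:
    """The 256 bits of SHA-256(message), most significant bit first."""
    digest = _h(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]


def lamport_keygen():
    """Private key: 256 pairs of random secrets. Public key: their hashes."""
    private = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    public = [(_h(zero), _h(one)) for zero, one in private]
    return private, public


def lamport_sign(private, message: bytes) -> list:
    """Reveal one secret per digest bit: element 0 for a 0 bit, element 1 for a 1 bit."""
    return [private[i][bit] for i, bit in enumerate(_message_bits(message))]


def lamport_verify(public, message: bytes, signature) -> bool:
    """Hash each revealed secret and compare with the matching public value."""
    if len(signature) != 256:
        return False
    bits = _message_bits(message)
    return all(_h(sig) == public[i][bit]
               for i, (sig, bit) in enumerate(zip(signature, bits)))
```

Each signature reveals half of the private key, so a Lamport key pair must be used for one message only.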
The purpose of this system is to provide schools with a new way to certify a student’s scores and a safer way to look up a student’s previous academic scores. When a student applies to a new school to continue their studies, the school typically uses the student’s previous academic scores to decide on admission. Currently, the way a school records a student’s marks and retrieves a new student’s score record depends on the student, which exposes the record to illegal modification and data leakage. The system described in this paper adds a certification that includes time information and score information. It also uses the digital signatures of teachers and headmasters to ensure that the scores were given by the corresponding teacher. The system does not include the Grade Point Average (GPA) directly, only the raw scores for each subject, because different schools convert raw scores into a GPA in different ways and use different GPA upper limits.
Recently, deep learning has gained considerable success and acceptance in a variety of fields, attracting an increasing number of researchers who are delving deeper and gaining a broader perspective on the subject. It provides more sustainability and opportunities to advance the development of society and transform the lives of individuals. Consequently, it is crucial to understand the development path of neural networks. This paper provides a concise overview of the structure and components of the Convolutional Neural Network (CNN), as well as some of the most well-known and influential models in the history of its development. Through an analysis of various CNN models, the workings of convolutional neural networks are investigated. The paper finds that the structure of neural networks is becoming deeper and more complex in order to achieve greater efficacy and avoid overfitting. Numerous parameters and perspectives remain for researchers to improve in order to advance neural network performance.
In today’s society, 3D modelling has become an indispensable part of people’s lives. The metaverse, games, industry, and medicine all rely on 3D modelling, and the models used in these fields are diverse. In the food field, however, there are few reference models for biscuits and no special views. Drawing on the different styles and types of biscuit models in the food industry, this article creates a Chinese New Year biscuit bearing the handwritten Chinese character “Spring”. The model was created with OpenSCAD. The article first introduces the OpenSCAD methods and usage that are encapsulated for modelling, then describes in detail how to build the biscuit layer by layer and how to create a handwritten Chinese character and a cream biscuit entirely in OpenSCAD code. The result is a Chinese New Year biscuit modelled on the “Spring” character written by the ancient Chinese calligrapher Yan Zhenqing. The work implements every stroke of the handwritten character in the cookie model, and these details were recognized by the majority of people in the final evaluation.
Many studies have examined the blockchain, the core of Bitcoin. It offers a wide range of distribution mechanisms and an infrastructure that keeps data constant, unchanging, and time-consistent. Because it creates digital summaries of physical or digital assets, the blockchain is an ideal tool for assertion-class applications that offer digital proof of ownership and time stamps. The blockchain can therefore be applied to a wide range of fields, such as online payment and trading. As the blockchain develops, safety issues directly affect the effectiveness and integrity of trades. In essence, these issues concern information security. To ensure the security of the data, this paper studies and applies the security of digital signatures. The article introduces the classification and characteristics of each kind of digital signature and analyzes other scholars' achievements in this field.
The YOLOv5 algorithm has gained popularity in recent years as an effective solution for real-time object detection in images and videos. This paper explores its potential for solving the problem of target tracking detection by proposing a modified YOLOv5 architecture that integrates object detection and tracking capabilities. The proposed YOLOv5-based tracking system includes three major components: object detection, object tracking, and object association. The object detection component uses a YOLOv5 model to detect and localize the target object in each frame of the video. The object tracking component then tracks the target object across frames using a Kalman filter and a Hungarian algorithm for data association. Finally, the object association component uses a motion model to handle occlusions and re-identifies the target object when it reappears in the field of view. The performance of the proposed YOLOv5-based tracking system is evaluated on several benchmark datasets, and its results are compared to state-of-the-art tracking algorithms. The experimental results show that the system achieves competitive tracking accuracy and real-time processing speed. Additionally, the effectiveness of the proposed motion model for handling occlusions and re-identification of the target object is demonstrated. In conclusion, the YOLOv5 algorithm has promising potential for target tracking detection in real-world scenarios, and it could have various applications in surveillance, robotics, and autonomous driving.
A medical delivery robot is a delivery robot used in the medical field. Compared with ordinary delivery robots, it needs to work in crowded environments, which means it must deal with many random obstacles at any time. This article discusses automatic obstacle avoidance, and formulates and compares algorithms to analyze the advantages and disadvantages of different automatic path exploration algorithms. To accomplish this goal, the paper uses Matlab as the main development tool and uses the A-star algorithm and the Euclid algorithm as the main heuristics in the program's mathematical model. The program must be able to complete the obstacle avoidance task on a map with obstacles of random size, position, shape, and number, and reach the end point from the starting point. In addition, the paper discusses the efficiency of the A-star algorithm and the Dijkstra algorithm in obstacle avoidance and route planning, and demonstrates why the A-star algorithm is more efficient using time and congestion indicators.
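The comparison between the A-star and Dijkstra algorithms in this abstract can be illustrated with a small grid search. The paper's implementation is in Matlab and is not reproduced here; the sketch below is in Python and assumes a 4-connected grid with unit step cost, obstacles marked `#`, and a Euclidean-distance heuristic (one reading of the "Euclid algorithm" mentioned above). The function name `grid_search` and the example map are hypothetical.

```python
import heapq
import math


def grid_search(grid, start, goal, use_heuristic=True):
    """Shortest path on a 4-connected grid; '#' cells are obstacles.

    With use_heuristic=True this is A-star with a Euclidean heuristic;
    with use_heuristic=False the heuristic is zero, which is Dijkstra.
    Returns (path, expanded), where path is None if the goal is unreachable.
    """
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        if not use_heuristic:
            return 0.0
        return math.hypot(cell[0] - goal[0], cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]  # entries are (f = g + h, g, cell)
    best_cost = {start: 0}
    parent = {start: None}
    closed = set()
    expanded = 0

    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell in closed:
            continue  # stale heap entry for an already expanded cell
        closed.add(cell)
        expanded += 1
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], expanded
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != "#":
                new_cost = cost + 1
                if new_cost < best_cost.get((nr, nc), math.inf):
                    best_cost[(nr, nc)] = new_cost
                    parent[(nr, nc)] = cell
                    heapq.heappush(open_heap, (new_cost + h((nr, nc)), new_cost, (nr, nc)))
    return None, expanded


# Hypothetical example map: '.' is free space, '#' is an obstacle.
EXAMPLE_GRID = [
    "....",
    ".##.",
    "....",
]
```

Calling `grid_search(EXAMPLE_GRID, (0, 0), (2, 3))` and again with `use_heuristic=False` returns paths of the same length; the `expanded` counts give one simple way to compare how much of the map each variant explores.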
With the development of a new round of scientific and technological revolution, big data, cloud computing, and other information technologies have injected new momentum into artificial intelligence research. The continuous improvement of artificial intelligence technology has had an impact on social life that cannot be ignored. Amid the wave of digital transformation among global enterprises, the low level of digital construction and the stalled digital transformation of small and medium-sized enterprises (SMEs) in China remain prominent problems. How can SMEs seize the opportunities brought by the development of artificial intelligence and effectively apply it to promote their transformation and upgrading? To clarify this issue, this paper first uses web crawling to obtain the annual reports of SMEs from 2017 to 2021 and uses Python to analyse how much emphasis SMEs place on AI. Then, it analyses the transformation difficulties faced by Chinese SMEs and systematically sorts out the artificial intelligence techniques that could be embedded into the main functions of enterprises. Finally, it puts forward specific measures for Chinese SMEs to actively apply artificial intelligence for digital transformation and provides suggestions for SMEs to better integrate into the era of the digital economy.
This study explores the performance of various deep learning models including ResNet152, VGG16, VGG19 and ResNet256 on the dog breed classification task. During training, the loss and accuracy trends are observed. The loss gradually decreases, showing that the model is fitting the training data better. As the capacity of the model improves, the accuracy shows a steady increase. The models converge after about 20 epochs and fluctuate little after that. The initial learning rate, adjustment factor and patience parameters play key roles in the convergence process. However, the achieved accuracy is below 90%, suggesting that further optimization or more complex architectures may be beneficial. Among all models, ResNet512 has the highest overall accuracy (83%), followed by ResNet256 (83%), VGG19 256 (79%) and VGG16 256 (78%). The ResNet model outperforms the VGG model in most cases, probably because its network structure reduces computational complexity while maintaining accuracy. Increasing the input size can improve the accuracy of the same network structure, such as ResNet 256 and ResNet 512, while modifying the network structure by adding more layers. Learning rate decay scheduling methods, such as ReduceLROnPlateau and CosineAnnealingLR, and optimizers such as SGD, Adam, and Adagrad, are explored as well.
A stroke can be triggered when the blood supply to part of the brain is suddenly blocked. Without blood supply, brain cells die bit by bit, and the resulting disability depends on the affected region of the brain. A stroke can cause paralysis or even death. Early symptom recognition can be very helpful in predicting stroke and fostering a healthy lifestyle. Because large-scale manual early diagnosis is almost impossible, machine learning is often regarded as the first choice for automatic risk estimation. In this research, multiple machine learning models are developed and trained to predict the long-term risk of suffering a stroke. The study’s main contribution is a random forest model whose high performance is verified by several metrics, namely F-measure, recall, AUC, accuracy, and precision. The results show that the random forest model surpasses the other models, with an AUC of 97.6%.
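Several of the evaluation metrics listed in this abstract follow directly from the confusion-matrix counts. The sketch below computes accuracy, precision, recall, and F-measure for binary labels, assuming label 1 marks the positive (stroke) class; AUC is omitted because it needs predicted scores rather than hard labels. The function name and the label encoding are assumptions, not details from the paper.

```python
def binary_classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F-measure for 0/1 labels (1 = positive)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")

    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    # Report 0.0 when a denominator is zero instead of dividing by zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    accuracy = (tp + tn) / len(y_true)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_measure": f_measure}
```

On imbalanced medical data, precision and recall are usually read together with accuracy, since a model that rarely predicts the positive class can still score a high accuracy.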
Currently, breast cancer holds a dominant position among all kinds of cancer. It is also the major cause of cancer-related death in China. Because many cancer types exist, breast cancer is hard to diagnose accurately. After early diagnosis and treatment, the 5-year survival rate of female breast cancer reaches 97%. Although breast cancer is curable when diagnosed early, about one third of female breast cancer patients still die of the disease. Moreover, despite early detection and the selection of new treatment methods, as many as 50% of women will still develop metastasis. At present, as the cause of breast cancer has not been determined, accurate early detection is crucial to reduce mortality. This paper aims to leverage machine learning methods to achieve accurate cancer diagnosis. It explains the principles of logistic regression in detail and classifies the data set by combining logistic regression and random forest. The task is a binary classification problem whose labels are malignant and benign, which suits the logistic regression model well. The classification results are more than 90% accurate, so machine learning-based solutions have broad application prospects and can produce practical value.
The stock market is one of the most dynamic and complex systems, and stock price prediction is essential for investors to make informed investment decisions and minimize risks. Over the years, various techniques have been developed for this task. This paper studies two machine learning models for predicting stock prices using time series analysis, namely Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN). The paper uses the time index as labels, and the data was standardized before training the two models with a step size. A loop was created to predict the daily stock price from the data of the previous three days. RMSE and MAE were the two indicators used to assess the models. The results indicate that both models performed well on this task, and that the CNN model performed better than the LSTM. As a suggestion, investors should also consider other factors such as market trends and risk management strategies when relying on these models, in order to obtain more accurate results.
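The two indicators named in this abstract have standard definitions, sketched below for plain Python sequences. The function names are illustrative, and whether the paper computed the errors on standardized or original prices is not stated.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error: square root of the mean squared difference."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error: mean of the absolute differences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
```

RMSE penalizes large individual errors more heavily than MAE, so reporting both gives a sense of whether a model's error is spread evenly or dominated by a few bad days.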
Blockchain, a distributed and decentralized ledger widely used in computer science and finance, provides safe and fast transactions for multiple parties, with each node on the blockchain checking the transactions. The consensus mechanism is the core of the blockchain. It lets all nodes reach agreement on those transactions, which ensures security and accuracy and makes Bitcoin valuable and popular. The two most mainstream consensus mechanisms are Proof of Work (PoW) and Proof of Stake (PoS), and Proof of Authority (PoA) is a newer one expected to be applied in the future. Many publications discuss consensus mechanisms, most of which are review papers. Those papers mainly show a specific aspect of a consensus mechanism or introduce the primary notion, but they rarely explain the relationship between the theories and cryptocurrencies. The purpose of this paper is therefore to give a clear structure, connect each consensus mechanism to its application, and simplify the reader's understanding. The paper provides an overview of the consensus mechanism, including its general definition, the concepts of different mechanism variants, and their advantages and disadvantages. It first introduces the notion of a consensus mechanism and how PoW, PoS, and PoA work. It then summarizes papers on these three consensus mechanisms, describing their theories and comparing their advantages and disadvantages. Finally, it presents a comparison table of the three consensus mechanisms to summarize the preceding content.
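The idea behind Proof of Work can be shown with a toy puzzle: find a nonce such that the hash of the block data plus the nonce meets a difficulty condition. The sketch below is a simplification, assuming a single SHA-256 hash and a difficulty expressed as a number of leading zero hex digits; Bitcoin itself uses double SHA-256 over a block header and a numeric target. The function names and the data format are hypothetical.

```python
import hashlib


def _pow_digest(block_data: str, nonce: int) -> str:
    # Assumed toy format: block data and nonce joined by a colon.
    return hashlib.sha256(f"{block_data}:{nonce}".encode("utf-8")).hexdigest()


def pow_mine(block_data: str, difficulty: int, max_nonce: int = 10_000_000):
    """Try nonces in order until the digest starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    for nonce in range(max_nonce):
        digest = _pow_digest(block_data, nonce)
        if digest.startswith(prefix):
            return nonce, digest
    raise RuntimeError("no valid nonce found below max_nonce")


def pow_verify(block_data: str, nonce: int, difficulty: int) -> bool:
    """Checking a claimed nonce takes one hash, while finding it takes many."""
    return _pow_digest(block_data, nonce).startswith("0" * difficulty)
```

The asymmetry between `pow_mine` and `pow_verify` is the point: each extra hex digit of difficulty multiplies the expected mining work by 16, while verification stays a single hash.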
Heart-related illness is the major cause of death globally. The optimal way to tackle this problem and improve public health is early detection and prevention. Manual diagnosis is tedious and time-consuming, which makes it difficult to apply to large-scale medical screening; hence, machine learning and computer-based automatic algorithms could be adopted. Logistic regression is a commonly used statistical method for predicting the risk of binary outcomes, such as the presence or absence of heart problems. In this study, logistic regression is applied to a dataset of medical records. It is developed not only as an effective model for the early detection of heart disease but also as a means of identifying the crucial risk factors of the disease. The results show that the logistic regression model achieved a high level of accuracy for heart risk prediction, with an overall accuracy of 85%. Sex, cholesterol level, age, and blood pressure are observed to have the highest correlations with heart disease.
Nowadays, many patients suffer from heart disease, and some die from it. It is commonly known that many health problems can cause heart disease directly or indirectly, e.g., being overweight, stroke, and high blood pressure. This study uses the Heart Disease Health Indicators Dataset from Kaggle to identify significant indicators of heart disease or heart attack, and predicts heart disease using logistic regression, random forest, and LightGBM. Based on the analysis, 10 variables, including health conditions, living habits, and age, are significantly relevant to heart disease. In addition, the comparison among the models shows that random forest is the most suitable model for predicting heart disease in the presence of multicollinearity. This paper selects important factors of heart disease and provides a fitting model for heart disease prediction. Based on the evaluation of logistic regression and random forest, the paper finds that random forest is the most suitable model for prediction. Overall, these results can guide further exploration of indicators of heart disease.
This paper provides a comprehensive overview of the current state of behavior recognition technology research and its applications in computer vision. First, it discusses the fundamental concepts and categorization methods employed in behavior recognition technology, such as multimodality, double-sided depth photos, bone key points, and RGB data. These techniques enable the recognition and analysis of various human behaviors with high accuracy and precision. Furthermore, this paper highlights the vast potential of behavior recognition technology in several fields, including safety and education. In safety settings, behavior recognition technology can assist managers in identifying abnormal behaviors and enhancing safety precautions. In educational settings, behavior recognition technology can help teachers gain insight into student learning levels and enhance their teaching efficiency. Additionally, this technology can be used to identify patterns of behaviors that might indicate a student is struggling or needs extra support. The paper concludes with a summary of the current state of behavior recognition research and suggests areas for further investigation. One potential area for future research is the development of more accurate and efficient recognition models. Exploring the ethical implications and privacy concerns of behavior recognition technology is also essential. Overall, this paper emphasizes the immense potential of behavior recognition technology in various fields and encourages further research to realize its full potential. By leveraging the power of computer vision, we can gain valuable insights into human behavior that could have far-reaching implications for safety, education, and other areas of our lives.
Portable Document Format (PDF) can integrate images, text, tables, and other elements into a single file, making it a document format that is easy to store and transfer. Because the defining feature of PDF is its portability, such files are very difficult to modify, which helps ensure the security of the text to a certain extent. PDF has therefore been welcomed by a large number of researchers, office workers, and students. However, as the demand for PDF files continues to rise, its security vulnerabilities are gradually being exposed. Consequently, a plethora of security measures for PDFs has emerged. This article first summarizes some of the overarching principles governing PDF security. It then focuses on the principles of steganography and cryptography in PDF security protocols, and selects meaningful and representative research papers to summarize the advantages and disadvantages of the proposed methods.