Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 2259

Search results for: ant colony algorithms

9 Interpretable Deep Learning Models for Medical Condition Identification

Authors: Dongping Fang, Lian Duan, Xiaojing Yuan, Mike Xu, Allyn Klunder, Kevin Tan, Suiting Cao, Yeqing Ji

Abstract:

Accurate prediction of a medical condition from direct clinical evidence is a long-sought goal in the medical management and health insurance field. Although great progress has been made with machine learning algorithms, the medical community is still, to a certain degree, suspicious about the models' accuracy and interpretability. This paper presents an innovative hierarchical attention deep learning model to achieve good prediction and clear interpretability that can be easily understood by medical professionals. This deep learning model uses a hierarchical attention structure that matches naturally with the medical history data structure and reflects the member’s encounter (date of service) sequence. The model attention structure consists of 3 levels: (1) attention on the medical code types (diagnosis codes, procedure codes, lab test results, and prescription drugs), (2) attention on the sequential medical encounters within a type, (3) attention on the medical codes within an encounter and type. This model is applied to predict the occurrence of stage 3 chronic kidney disease (CKD3), using three years’ medical history of Medicare Advantage (MA) members from a top health insurance company. The model takes members’ medical events, both claims and electronic medical record (EMR) data, as input, makes a prediction of CKD3 and calculates the contribution from individual events to the predicted outcome. The model outcome can be easily explained with the clinical evidence identified by the model algorithm. Here are examples: Member A had 36 medical encounters in the past three years: multiple office visits, lab tests and medications. The model predicts member A has a high risk of CKD3, with the following strongly contributing clinical events: multiple high ‘Creatinine in Serum or Plasma’ tests and multiple low ‘Glomerular filtration rate’ tests indicating reduced kidney function. Among the abnormal lab tests, more recent results contributed more to the prediction.
The model also indicates regular office visits, no abnormal findings of medical examinations, and taking proper medications decreased the CKD3 risk. Member B had 104 medical encounters in the past 3 years and was predicted to have a low risk of CKD3, because the model didn’t identify diagnoses, procedures, or medications related to kidney disease, and many lab test results, including ‘Glomerular filtration rate’ were within the normal range. The model accurately predicts members A and B and provides interpretable clinical evidence that is validated by clinicians. Without extra effort, the interpretation is generated directly from the model and presented together with the occurrence date. Our model uses the medical data in its most raw format without any further data aggregation, transformation, or mapping. This greatly simplifies the data preparation process, mitigates the chance for error and eliminates post-modeling work needed for traditional model explanation. To our knowledge, this is the first paper on an interpretable deep-learning model using a 3-level attention structure, sourcing both EMR and claim data, including all 4 types of medical data, on the entire Medicare population of a big insurance company, and more importantly, directly generating model interpretation to support user decision. In the future, we plan to enrich the model input by adding patients’ demographics and information from free-texted physician notes.
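The three attention levels described in this abstract can be illustrated with a minimal sketch. It is not the authors' model: the function `code_contributions`, the score tables, and the recency-based encounter score are hypothetical stand-ins for values a trained network would produce. It only shows how per-level softmax weights multiply into one contribution per medical code.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def code_contributions(history, code_score, type_score):
    """Combine three attention levels into one weight per medical code.

    history maps a code type to a list of encounters (oldest first);
    each encounter is a non-empty list of distinct codes. All scores are
    hypothetical placeholders for learned attention scores.
    """
    types = list(history)
    type_w = softmax([type_score[t] for t in types])  # level 1: code types
    contributions = {}
    for t, tw in zip(types, type_w):
        encounters = history[t]
        # Level 2: toy recency score, so later encounters weigh more.
        enc_w = softmax([float(i) for i in range(len(encounters))])
        for i, (codes, ew) in enumerate(zip(encounters, enc_w)):
            # Level 3: codes within one encounter of one type.
            code_w = softmax([code_score.get(c, 0.0) for c in codes])
            for c, cw in zip(codes, code_w):
                contributions[(t, i, c)] = tw * ew * cw
    return contributions


history = {
    "lab": [["gfr_normal"], ["creatinine_high", "gfr_low"]],
    "drug": [["statin"]],
}
weights = code_contributions(
    history,
    code_score={"creatinine_high": 2.0, "gfr_low": 2.0},
    type_score={"lab": 1.0, "drug": 0.0},
)
for key, w in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(key, round(w, 3))
```

Because each level is a softmax, the products sum to 1 over all codes, which is what allows a per-event contribution to be read off directly.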

Keywords: deep learning, interpretability, attention, big data, medical conditions

Procedia PDF Downloads 91

8 Enhancing Plant Throughput in Mineral Processing Through Multimodal Artificial Intelligence

Authors: Muhammad Bilal Shaikh

Abstract:

Mineral processing plants play a pivotal role in extracting valuable minerals from raw ores, contributing significantly to various industries. However, the optimization of plant throughput remains a complex challenge, necessitating innovative approaches for increased efficiency and productivity. This research paper investigates the application of Multimodal Artificial Intelligence (MAI) techniques to address this challenge, aiming to improve overall plant throughput in mineral processing operations. The integration of multimodal AI leverages a combination of diverse data sources, including sensor data, images, and textual information, to provide a holistic understanding of the complex processes involved in mineral extraction. The paper explores the synergies between various AI modalities, such as machine learning, computer vision, and natural language processing, to create a comprehensive and adaptive system for optimizing mineral processing plants. The primary focus of the research is on developing advanced predictive models that can accurately forecast various parameters affecting plant throughput. Utilizing historical process data, machine learning algorithms are trained to identify patterns, correlations, and dependencies within the intricate network of mineral processing operations. This enables real-time decision-making and process optimization, ultimately leading to enhanced plant throughput. Incorporating computer vision into the multimodal AI framework allows for the analysis of visual data from sensors and cameras positioned throughout the plant. This visual input aids in monitoring equipment conditions, identifying anomalies, and optimizing the flow of raw materials. The combination of machine learning and computer vision enables the creation of predictive maintenance strategies, reducing downtime and improving the overall reliability of mineral processing plants. 
Furthermore, the integration of natural language processing facilitates the extraction of valuable insights from unstructured textual data, such as maintenance logs, research papers, and operator reports. By understanding and analyzing this textual information, the multimodal AI system can identify trends, potential bottlenecks, and areas for improvement in plant operations. This comprehensive approach enables a more nuanced understanding of the factors influencing throughput and allows for targeted interventions. The research also explores the challenges associated with implementing multimodal AI in mineral processing plants, including data integration, model interpretability, and scalability. Addressing these challenges is crucial for the successful deployment of AI solutions in real-world industrial settings. To validate the effectiveness of the proposed multimodal AI framework, the research conducts case studies in collaboration with mineral processing plants. The results demonstrate tangible improvements in plant throughput, efficiency, and cost-effectiveness. The paper concludes with insights into the broader implications of implementing multimodal AI in mineral processing and its potential to revolutionize the industry by providing a robust, adaptive, and data-driven approach to optimizing plant operations. In summary, this research contributes to the evolving field of mineral processing by showcasing the transformative potential of multimodal artificial intelligence in enhancing plant throughput. The proposed framework offers a holistic solution that integrates machine learning, computer vision, and natural language processing to address the intricacies of mineral extraction processes, paving the way for a more efficient and sustainable future in the mineral processing industry.

Keywords: multimodal AI, computer vision, NLP, mineral processing, mining

Procedia PDF Downloads 68

7 Contactless Heart Rate Measurement System based on FMCW Radar and LSTM for Automotive Applications

Authors: Asma Omri, Iheb Sifaoui, Sofiane Sayahi, Hichem Besbes

Abstract:

Future vehicle systems demand advanced capabilities, notably in-cabin life detection and driver monitoring systems, with a particular emphasis on drowsiness detection. To meet these requirements, several techniques employ artificial intelligence methods based on real-time vital sign measurements. In parallel, Frequency-Modulated Continuous-Wave (FMCW) radar technology has garnered considerable attention in the domains of healthcare and biomedical engineering for non-invasive vital sign monitoring. FMCW radar offers a multitude of advantages, including its non-intrusive nature, continuous monitoring capacity, and its ability to penetrate through clothing. In this paper, we propose a system utilizing the AWR6843AOP radar from Texas Instruments (TI) to extract precise vital sign information. The radar allows us to estimate Ballistocardiogram (BCG) signals, which capture the mechanical movements of the body, particularly the ballistic forces generated by heartbeats and respiration. These signals are rich sources of information about the cardiac cycle, rendering them suitable for heart rate estimation. The process begins with real-time subject positioning, followed by clutter removal, computation of Doppler phase differences, and the use of various filtering methods to accurately capture subtle physiological movements. To address the challenges associated with FMCW radar-based vital sign monitoring, including motion artifacts due to subjects' movement or radar micro-vibrations, Long Short-Term Memory (LSTM) networks are implemented. LSTM's adaptability to different heart rate patterns and ability to handle real-time data make it suitable for continuous monitoring applications. 
Several crucial steps were taken, including feature extraction (involving amplitude, time intervals, and signal morphology), sequence modeling, heart rate estimation through the analysis of detected cardiac cycles and their temporal relationships, and performance evaluation using metrics such as Root Mean Square Error (RMSE) and correlation with reference heart rate measurements. For dataset construction and LSTM training, a comprehensive data collection system was established, integrating the AWR6843AOP radar, a Heart Rate Belt, and a smartwatch for ground truth measurements. Rigorous synchronization of these devices ensured data accuracy. Twenty participants engaged in various scenarios, encompassing indoor and real-world conditions within a moving vehicle equipped with the radar system. Both static and dynamic subject conditions were considered. The heart rate estimation through LSTM outperforms traditional signal processing techniques that rely on filtering, Fast Fourier Transform (FFT), and thresholding. It delivers an average accuracy of approximately 91% with an RMSE of 1.01 beats per minute (bpm). In conclusion, this paper underscores the promising potential of FMCW radar technology integrated with artificial intelligence algorithms in the context of automotive applications. This innovation not only enhances road safety but also paves the way for its integration into the automotive ecosystem to improve driver well-being and overall vehicular safety.
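For orientation, the traditional spectral baseline mentioned in this abstract, together with the RMSE metric, can be sketched as follows. This is an illustration of the baseline idea only, not the authors' LSTM pipeline. The function names, the 0.8 to 3.0 Hz heart band, the 20 Hz sampling rate, and the synthetic phase signal are all assumptions chosen for the example.

```python
import cmath
import math


def heart_rate_bpm(phase, fs, band=(0.8, 3.0)):
    """Estimate heart rate as the strongest DFT peak inside `band` (in Hz)."""
    n = len(phase)
    mean = sum(phase) / n
    centred = [p - mean for p in phase]  # remove the static offset
    k_lo = math.ceil(band[0] * n / fs)
    k_hi = math.floor(band[1] * n / fs)
    best_k, best_mag = k_lo, -1.0
    for k in range(k_lo, k_hi + 1):  # evaluate the DFT only over the heart band
        coeff = sum(v * cmath.exp(-2j * math.pi * k * i / n)
                    for i, v in enumerate(centred))
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)
    return 60.0 * best_k * fs / n


def rmse(estimates, references):
    """Root mean square error between two equally long sequences."""
    squared = [(e - r) ** 2 for e, r in zip(estimates, references)]
    return math.sqrt(sum(squared) / len(squared))


# Synthetic chest-displacement phase: strong respiration at 0.3 Hz plus a
# weak heartbeat component at 1.2 Hz (72 bpm), sampled for 30 s at 20 Hz.
fs = 20.0
phase = [
    1.0 * math.sin(2 * math.pi * 0.3 * i / fs)
    + 0.1 * math.sin(2 * math.pi * 1.2 * i / fs)
    for i in range(600)
]
estimate = heart_rate_bpm(phase, fs)
print(estimate, rmse([estimate, 80.0], [71.0, 81.0]))
```

Restricting the search to the heart band keeps the much stronger respiration component from being picked as the peak. A baseline of this kind has no defence against motion artifacts, which is the gap the abstract says the LSTM is meant to address.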

Keywords: ballistocardiogram, FMCW Radar, vital sign monitoring, LSTM

Procedia PDF Downloads 72

6 The Use of Rule-Based Cellular Automata to Track and Forecast the Dispersal of Classical Biocontrol Agents at Scale, with an Application to the Fopius arisanus Fruit Fly Parasitoid

Authors: Agboka Komi Mensah, John Odindi, Elfatih M. Abdel-Rahman, Onisimo Mutanga, Henri Ez Tonnang

Abstract:

Ecosystems are networks of organisms and populations that form a community of various species interacting within their habitats. Such habitats are defined by abiotic and biotic conditions that establish the initial limits to a population's growth, development, and reproduction. The habitat’s conditions explain the context in which species interact to access resources such as food, water, space, shelter, and mates, allowing for feeding, dispersal, and reproduction. Dispersal is an essential life-history strategy that affects gene flow, resource competition, population dynamics, and species distributions. Despite the importance of dispersal in population dynamics and survival, understanding the mechanism underpinning the dispersal of organisms remains challenging. For instance, when an organism moves into an ecosystem for survival and resource competition, its progression is highly influenced by factors such as its physiological state, climatic variables, and its ability to evade predation. Therefore, greater spatial detail is necessary to understand organism dispersal dynamics. Organism dispersal can be studied using empirical and mechanistic modelling approaches, with the adopted approach depending on the study's purpose. Cellular automata (CA) are an example of these approaches and have been successfully used in biological studies to analyze the dispersal of living organisms. A cellular automaton can be briefly described as a grid of cells, each of which may be occupied by an individual and evolves according to a set of rules based on the states of its neighbours. However, for modelling the dispersal of individual organisms at the landscape scale, we lack user-friendly tools that do not require expertise in mathematical models and computing, such as a visual analytics framework for tracking and forecasting the dispersal behaviour of organisms.
The term "visual analytics" (VA) describes a semiautomated approach to electronic data processing that is guided by users who can interact with data via an interface. Essentially, VA converts large amounts of quantitative or qualitative data into graphical formats that can be customized based on the operator's needs. Additionally, this approach can be used to enhance the ability of users from various backgrounds to understand data, communicate results, and disseminate information across a wide range of disciplines. To support effective analysis of the dispersal of organisms at the landscape scale, we therefore designed Pydisp, a free visual data analytics tool for spatiotemporal dispersal modeling built in Python. Its user interface allows users to perform a quick and interactive spatiotemporal analysis of species dispersal using bioecological and climatic data. Pydisp enables reuse and upgrade through the use of simple principles such as fuzzy cellular automata algorithms. The potential of dispersal modeling is demonstrated in a case study by predicting the dispersal of Fopius arisanus (Sonan), an endoparasitoid used to control Bactrocera dorsalis (Hendel) (Diptera: Tephritidae) in Kenya. The results obtained from our example clearly illustrate the parasitoid's dispersal process at the landscape level and confirm that dynamic processes in an agroecosystem are better understood when designed using mechanistic modelling approaches. Furthermore, as demonstrated in the example, the built software is highly effective in portraying the dispersal of organisms despite the unavailability of detailed data on the species' dispersal mechanisms.
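The rule-based cellular automaton idea in this abstract can be sketched with a toy example. It does not reproduce Pydisp or its fuzzy rules: the `step` function, the single suitability threshold, and the 5 x 5 landscape are hypothetical simplifications. Here an empty cell becomes occupied when its habitat suitability is high enough and at least one of its eight neighbours is already occupied.

```python
def step(occupied, suitability, threshold=0.5):
    """Apply one synchronous update of the dispersal rule to the whole grid."""
    rows, cols = len(occupied), len(occupied[0])
    new = [row[:] for row in occupied]
    for r in range(rows):
        for c in range(cols):
            if occupied[r][c] or suitability[r][c] < threshold:
                continue  # already colonised, or habitat unsuitable
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    inside = 0 <= rr < rows and 0 <= cc < cols
                    if (dr or dc) and inside and occupied[rr][cc]:
                        new[r][c] = True
    return new


# Toy landscape: suitable everywhere except column 3, which acts as a barrier.
suitability = [[0.2 if c == 3 else 0.9 for c in range(5)] for _ in range(5)]
occupied = [[False] * 5 for _ in range(5)]
occupied[2][0] = True  # release point on the left edge

for _ in range(4):
    occupied = step(occupied, suitability)

for row in occupied:
    print("".join("#" if cell else "." for cell in row))
```

In a fuzzy variant, the hard threshold would presumably be replaced by graded membership values derived from bioecological and climatic layers. The crisp rule is used here only to keep the mechanics visible.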

Keywords: cellular automata, fuzzy logic, landscape, spatiotemporal

Procedia PDF Downloads 77

5 Speeding Up Lenia: A Comparative Study Between Existing Implementations and CUDA C++ with OpenGL Interop

Authors: L. Diogo, A. Legrand, J. Nguyen-Cao, J. Rogeau, S. Bornhofen

Abstract:

Lenia is a system of cellular automata with continuous states, space and time, which surprises not only with the emergence of interesting life-like structures but also with its beauty. This paper reports ongoing research on a GPU implementation of Lenia using CUDA C++ and OpenGL Interoperability. We demonstrate how CUDA as a low-level GPU programming paradigm allows optimizing performance and memory usage of the Lenia algorithm. A comparative analysis through experimental runs with existing implementations shows that the CUDA implementation outperforms the others by one order of magnitude or more. Cellular automata hold significant interest due to their ability to model complex phenomena in systems with simple rules and structures. They allow exploring emergent behavior such as self-organization and adaptation, and find applications in various fields, including computer science, physics, biology, and sociology. Unlike classic cellular automata which rely on discrete cells and values, Lenia generalizes the concept of cellular automata to continuous space, time and states, thus providing additional fluidity and richness in emerging phenomena. In the current literature, there are many implementations of Lenia utilizing various programming languages and visualization libraries. However, each implementation also presents certain drawbacks, which serve as motivation for further research and development. In particular, speed is a critical factor when studying Lenia, for several reasons. Rapid simulation allows researchers to observe the emergence of patterns and behaviors in more configurations, on bigger grids and over longer periods without annoying waiting times. Thereby, they enable the exploration and discovery of new species within the Lenia ecosystem more efficiently. Moreover, faster simulations are beneficial when we include additional time-consuming algorithms such as computer vision or machine learning to evolve and optimize specific Lenia configurations. 
We developed a Lenia implementation for GPU using the C++ and CUDA programming languages, and CUDA/OpenGL Interoperability for immediate rendering. The goal of our experiment is to benchmark this implementation compared to the existing ones in terms of speed, memory usage, configurability and scalability. In our comparison we focus on the most important Lenia implementations, selected for their prominence, accessibility and widespread use in the scientific community. The implementations include MATLAB, JavaScript, ShaderToy GLSL, Jupyter, Rust and R. The list is not exhaustive but provides a broad view of the principal current approaches and their respective strengths and weaknesses. Our comparison primarily considers computational performance and memory efficiency, as these factors are critical for large-scale simulations, but we also investigate the ease of use and configurability. The experimental runs conducted so far demonstrate that the CUDA C++ implementation outperforms the other implementations by one order of magnitude or more. The benefits of using the GPU become apparent especially with larger grids and convolution kernels. However, our research is still ongoing. We are currently exploring the impact of several software design choices and optimization techniques, such as convolution with Fast Fourier Transforms (FFT), various GPU memory management scenarios, and the trade-off between speed and accuracy using single versus double precision floating point arithmetic. The results will give valuable insights into the practice of parallel programming of the Lenia algorithm, and all conclusions will be thoroughly presented in the conference paper. The final version of our CUDA C++ implementation will be published on github and made freely accessible to the Alife community for further development.
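As a reference point for what such implementations compute, a single Lenia update can be sketched in plain Python: convolve the grid with a normalised ring kernel, pass the result through a growth function, then add the scaled growth and clip to [0, 1]. This is a slow, unoptimised illustration, not the CUDA implementation described here. The Gaussian-bump kernel shape and the parameter values (`mu`, `sigma`, `dt`, `radius`) are illustrative assumptions.

```python
import math


def ring_kernel(radius):
    """Normalised ring-shaped kernel stored as {(dy, dx): weight}."""
    weights = {}
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            r = math.hypot(dy, dx) / radius
            if 0 < r < 1:
                # Gaussian bump centred at half the kernel radius.
                weights[(dy, dx)] = math.exp(-((r - 0.5) / 0.15) ** 2 / 2)
    total = sum(weights.values())
    return {offset: w / total for offset, w in weights.items()}


def growth(u, mu=0.15, sigma=0.015):
    """Gaussian growth mapping, with values in (-1, 1]."""
    return 2.0 * math.exp(-((u - mu) ** 2) / (2 * sigma ** 2)) - 1.0


def lenia_step(grid, kernel, dt=0.1):
    """One update A <- clip(A + dt * G(K * A), 0, 1) on a toroidal grid."""
    n_rows, n_cols = len(grid), len(grid[0])
    new = []
    for y in range(n_rows):
        row = []
        for x in range(n_cols):
            u = sum(w * grid[(y + dy) % n_rows][(x + dx) % n_cols]
                    for (dy, dx), w in kernel.items())
            row.append(min(1.0, max(0.0, grid[y][x] + dt * growth(u))))
        new.append(row)
    return new


kernel = ring_kernel(radius=4)
uniform = [[0.15] * 16 for _ in range(16)]
empty = [[0.0] * 16 for _ in range(16)]
print(lenia_step(uniform, kernel)[0][0], lenia_step(empty, kernel)[0][0])
```

The direct convolution costs on the order of grid size times kernel size per step. That cost is what makes larger grids and kernels expensive and motivates both GPU parallelism and FFT-based convolution.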

Keywords: artificial life, cellular automaton, GPU optimization, Lenia, comparative analysis

Procedia PDF Downloads 41

4 Leveraging Digital Transformation Initiatives and Artificial Intelligence to Optimize Readiness and Simulate Mission Performance across the Fleet

Authors: Justin Woulfe

Abstract:

Siloed logistics and supply chain management systems throughout the Department of Defense (DoD) have led to disparate approaches to modeling and simulation (M&S), a lack of understanding of how one system impacts the whole, and issues with “optimal” solutions that are good for one organization but have dramatic negative impacts on another. Many different systems have evolved to try to understand and account for uncertainty and try to reduce the consequences of the unknown. As the DoD undertakes expansive digital transformation initiatives, there is an opportunity to fuse and leverage traditionally disparate data into a centrally hosted source of truth. With a streamlined process incorporating machine learning (ML) and artificial intelligence (AI), advanced M&S will enable informed decisions guiding program success via optimized operational readiness and improved mission success. One of the current challenges is to leverage the terabytes of data generated by monitored systems to provide actionable information for all levels of users. The implementation of a cloud-based application analyzing data transactions, learning and predicting future states from current and past states in real-time, and communicating those anticipated states is an appropriate solution for the purposes of reduced latency and improved confidence in decisions. Decisions made from an ML and AI application combined with advanced optimization algorithms will improve the mission success and performance of systems, which will improve the overall cost and effectiveness of any program. The Systecon team constructs and employs model-based simulations, cutting across traditional silos of data, aggregating maintenance and supply data, incorporating sensor information, and applying optimization and simulation methods to an as-maintained digital twin with the ability to aggregate results across a system’s lifecycle and across logical and operational groupings of systems.
This coupling of data throughout the enterprise enables tactical, operational, and strategic decision support, detachable and deployable logistics services, and configuration-based automated distribution of digital technical and product data to enhance supply and logistics operations. As a complete solution, this approach significantly reduces program risk by allowing flexible configuration of data, data relationships, business process workflows, and early test and evaluation, especially budget trade-off analyses. A true capability to tie resources (dollars) to weapon system readiness in alignment with the real-world scenarios a warfighter may experience is an objective that has yet to be realized. By developing and solidifying an organic capability to directly relate dollars to readiness and to inform the digital twin, the decision-maker is now empowered through valuable insight and traceability. This type of educated decision-making provides an advantage over adversaries who struggle with maintaining system readiness at an affordable cost. The M&S capability developed allows program managers to independently evaluate system design and support decisions by quantifying their impact on operational availability and operations and support cost, resulting in the ability to simultaneously optimize readiness and cost. This will allow the stakeholders to make data-driven decisions when trading cost and readiness throughout the life of the program. Finally, sponsors are able to validate product deliverables with efficiency and much higher accuracy than in previous years.

Keywords: artificial intelligence, digital transformation, machine learning, predictive analytics

Procedia PDF Downloads 160

3 Location3: A Location Scouting Platform for the Support of Film and Multimedia Industries

Authors: Dimitrios Tzilopoulos, Panagiotis Symeonidis, Michael Loufakis, Dimosthenis Ioannidis, Dimitrios Tzovaras

Abstract:

The domestic film industry in Greece has traditionally relied heavily on state support. While film productions are crucial for the country's economy, Greece has not fully capitalized on attracting and promoting foreign productions. The lack of motivation, organized state support for attraction and licensing, and the absence of location scouting have hindered its potential. Although recent legislative changes have addressed the first two of these issues, the development of a comprehensive location database and a search engine that would effectively support location scouting at the pre-production stage is still in its early stages. In addition to the expected benefits for the film, television, marketing, and multimedia industries, a location-scouting service platform has the potential to yield significant financial gains locally and nationally. By promoting featured places like cultural and archaeological sites, natural monuments, and attraction points for visitors, it plays a vital role in both cultural promotion and facilitating tourism development. This study introduces LOCATION3, an internet platform revolutionizing film production location management. It interconnects location providers, film crews, and multimedia stakeholders, offering a comprehensive environment for seamless collaboration. The platform's central geodatabase (PostgreSQL) stores each location’s attributes, while web technologies like HTML, JavaScript, CSS, React.js, and Redux power the user-friendly interface. Advanced functionalities, utilizing deep learning models, developed in Python, are integrated via Node.js. Visual data presentation is achieved using the JS Leaflet library, delivering an interactive map experience. LOCATION3 sets a new standard, offering a range of essential features to enhance the management of film production locations.
Firstly, it empowers users to effortlessly upload audiovisual material enriched with geospatial and temporal data, such as location coordinates, photographs, videos, 360-degree panoramas, and 3D location models. With the help of cutting-edge deep learning algorithms, the application automatically tags these materials, while users can also manually tag them. Moreover, the application allows users to record locations directly through its user-friendly mobile application. Users can then embark on seamless location searches, employing spatial or descriptive criteria. This intelligent search functionality considers a combination of relevant tags, dominant colors, architectural characteristics, emotional associations, and unique location traits. One of the application's standout features is the ability to explore locations by their visual similarity to other materials, facilitated by a reverse image search. Also, the interactive map serves as both a dynamic display for locations and a versatile filter, adapting to the user's preferences and effortlessly enhancing location searches. To further streamline the process, the application facilitates the creation of location lightboxes, enabling users to efficiently organize and share their content via email. Going above and beyond location management, the platform also provides invaluable liaison, matchmaking, and online marketplace services. This powerful functionality bridges the gap between visual and three-dimensional geospatial material providers, local agencies, film companies, production companies, etc. so that those interested in a specific location can access additional material beyond what is stored on the platform, as well as access production services supporting the functioning and completion of productions in a location (equipment provision, transportation, catering, accommodation, etc.).

Keywords: deep learning models, film industry, geospatial data management, location scouting

Procedia PDF Downloads 71

2 Unleashing Potential in Pedagogical Innovation for STEM Education: Applying Knowledge Transfer Technology to Guide a Co-Creation Learning Mechanism for the Lingering Effects Amid COVID-19

Authors: Lan Cheng, Harry Qin, Yang Wang

Abstract:

Background: COVID-19 has induced the largest digital learning experiment in history. There is also emerging research evidence that students have paid a high cost of learning loss from virtual learning. University-wide survey results demonstrate that digital learning remains difficult for students who struggle with learning challenges, isolation, or a lack of resources. Large-scale efforts are therefore increasingly devoted to digital education. To better prepare students in higher education for this grand scientific and technological transformation, STEM education has been prioritized and promoted as a strategic imperative in the ongoing curriculum reform essential for unfinished learning needs and whole-person development. Building upon five key elements identified in the STEM education literature (Problem-based Learning, Community and Belonging, Technology Skills, Personalization of Learning, and Connection to the External Community), this case study explores the potential of pedagogical innovation that integrates computational and experimental methodologies to support, enrich, and navigate STEM education. Objectives: The goal of this case study is to create a high-fidelity prototype design for STEM education with knowledge transfer technology that contains a Cooperative Multi-Agent System (CMAS). Its objectives are to (1) conduct an assessment to reveal a virtual learning mechanism and establish strategies to facilitate scientific learning engagement, accessibility, and connection within and beyond the university setting, (2) explore and validate an interactional co-creation approach embedded in project-based learning activities under the STEM learning context, which is being transformed by both digital technology and student behavior change, and (3) formulate and implement the STEM-oriented campaign to guide learning network mapping, mitigate the loss of learning, enhance the learning experience, and scale up inclusive participation.
Methods: This study applied a case study strategy and a methodology informed by Social Network Analysis Theory within a cross-disciplinary communication paradigm (students, peers, educators). Knowledge transfer technology is introduced to address learning challenges and to increase the efficiency of Reinforcement Learning (RL) algorithms. A co-creation learning framework was identified and investigated in a context-specific way with a learning analytic tool designed in this study. Findings: The results show that (1) CMAS-empowered learning support reduced students’ confusion, difficulties, and gaps during problem-solving scenarios while increasing learner capacity empowerment, (2) the co-creation learning phenomenon, examined through the lens of the campaign, reveals that an interactive virtual learning environment helps students navigate scientific challenges independently and collaboratively, and (3) the deliverables from the STEM educational campaign provide a methodological framework both within the context of curriculum design and for external community engagement applications. Conclusion: This study brings a holistic and coherent pedagogy that cultivates students’ interest in STEM and develops a knowledge base that lets them integrate and apply knowledge across different STEM disciplines. Through co-designed, cross-disciplinary educational content and campaign promotion, the findings suggest factors to empower evidence-based learning practice while also piloting and tracking the impact of the scholastic value of co-creation under the dynamic learning environment. The data nested under the knowledge transfer technology situates learners’ scientific journey and could pave the way for theoretical advancement and broader scientific endeavors within larger datasets, projects, and communities.

Keywords: co-creation, cross-disciplinary, knowledge transfer, STEM education, social network analysis

Procedia PDF Downloads 114

1 Times2D: A Time-Frequency Method for Time Series Forecasting

Authors: Reza Nematirad, Anil Pahwa, Balasubramaniam Natarajan

Abstract:

Time series data consist of successive data points collected over a period of time. Accurate prediction of future values is essential for informed decision-making in several real-world applications, including electricity load demand forecasting, lifetime estimation of industrial machinery, traffic planning, weather prediction, and the stock market. Due to their critical relevance and wide application, there has been considerable interest in time series forecasting in recent years. However, the proliferation of sensors and IoT devices, real-time monitoring systems, and high-frequency trading data introduce significant intricate temporal variations, rapid changes, noise, and non-linearities, making time series forecasting more challenging. Classical methods such as Autoregressive integrated moving average (ARIMA) and Exponential Smoothing aim to extract pre-defined temporal variations, such as trends and seasonality. While these methods are effective for capturing well-defined seasonal patterns and trends, they often struggle with more complex, non-linear patterns present in real-world time series data. In recent years, deep learning has made significant contributions to time series forecasting. Recurrent Neural Networks (RNNs) and their variants, such as Long short-term memory (LSTMs) and Gated Recurrent Units (GRUs), have been widely adopted for modeling sequential data. However, they often suffer from the locality, making it difficult to capture local trends and rapid fluctuations. Convolutional Neural Networks (CNNs), particularly Temporal Convolutional Networks (TCNs), leverage convolutional layers to capture temporal dependencies by applying convolutional filters along the temporal dimension. Despite their advantages, TCNs struggle with capturing relationships between distant time points due to the locality of one-dimensional convolution kernels. 
Transformers have revolutionized time series forecasting with their powerful attention mechanisms, effectively capturing long-term dependencies and relationships between distant time points. However, the attention mechanism may struggle to discern dependencies directly from scattered time points due to intricate temporal patterns. Lastly, Multi-Layer Perceptrons (MLPs) have also been employed, with models like N-BEATS and LightTS demonstrating success. Despite this, MLPs often face high volatility and computational complexity challenges in long-horizon forecasting. To address intricate temporal variations in time series data, this study introduces Times2D, a novel framework that integrates 2D spectrogram and derivative heatmap techniques in parallel. The spectrogram focuses on the frequency domain, capturing periodicity, while the derivative patterns emphasize the time domain, highlighting sharp fluctuations and turning points. This 2D transformation enables the utilization of powerful computer vision techniques to capture various intricate temporal variations. To evaluate the performance of Times2D, extensive experiments were conducted on standard time series datasets and compared with various state-of-the-art algorithms, including DLinear (2023), TimesNet (2023), Non-stationary Transformer (2022), PatchTST (2023), N-HiTS (2023), Crossformer (2023), MICN (2023), LightTS (2022), FEDformer (2022), FiLM (2022), SCINet (2022a), Autoformer (2021), and Informer (2021) under the same modeling conditions. The initial results demonstrated that Times2D achieves consistent state-of-the-art performance in both short-term and long-term forecasting tasks. Furthermore, the generality of the Times2D framework allows it to be applied to various tasks such as time series imputation, clustering, classification, and anomaly detection, offering potential benefits in any domain that involves sequential data analysis.
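The two 2D views named in this abstract can be illustrated with a rough sketch. The abstract does not specify how Times2D builds them, so everything here is an assumption: the window and hop sizes, the Hann window, the use of first and second finite differences as the derivative map, and the function names.

```python
import cmath
import math


def spectrogram(series, win=16, hop=8):
    """Magnitude short-time DFT: one row per frame, win // 2 + 1 bins per row."""
    hann = [0.5 - 0.5 * math.cos(2 * math.pi * i / win) for i in range(win)]
    frames = []
    for start in range(0, len(series) - win + 1, hop):
        seg = [series[start + i] * hann[i] for i in range(win)]
        frames.append([
            abs(sum(v * cmath.exp(-2j * math.pi * k * i / win)
                    for i, v in enumerate(seg)))
            for k in range(win // 2 + 1)
        ])
    return frames


def derivative_map(series):
    """Two-row map: first and second finite differences, aligned in time."""
    d1 = [b - a for a, b in zip(series, series[1:])]
    d2 = [b - a for a, b in zip(d1, d1[1:])]
    return [d1[:len(d2)], d2]


# A pure sinusoid with a period of 8 samples, i.e. bin 2 of a 16-point DFT.
series = [math.sin(2 * math.pi * i / 8) for i in range(64)]
spec = spectrogram(series)
deriv = derivative_map(series)
peak_bins = [max(range(len(f)), key=f.__getitem__) for f in spec]
print(len(spec), len(spec[0]), len(deriv), len(deriv[0]), peak_bins)
```

Both outputs are rectangular arrays, so in principle either can be passed to image-style models such as 2D convolutions. That is the sense in which a 2D transformation opens the door to computer vision techniques.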

Keywords: derivative patterns, spectrogram, time series forecasting, times2D, 2D representation

Procedia PDF Downloads 42