Search results for: sesimic data processing
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 26968

Search results for: sesimic data processing

26038 Simulation and Hardware Implementation of Data Communication Between CAN Controllers for Automotive Applications

Authors: R. M. Kalayappan, N. Kathiravan

Abstract:

In automobile industries, Controller Area Network (CAN) is widely used to reduce the system complexity and inter-task communication. Therefore, this paper proposes the hardware implementation of data frame communication between one controller to other. The CAN data frames and protocols will be explained deeply, here. The data frames are transferred without any collision or corruption. The simulation is made in the KEIL vision software to display the data transfer between transmitter and receiver in CAN. ARM7 micro-controller is used to transfer data’s between the controllers in real time. Data transfer is verified using the CRO.

Keywords: control area network (CAN), automotive electronic control unit, CAN 2.0, industry

Procedia PDF Downloads 396
26037 Improving the Statistics Nature in Research Information System

Authors: Rajbir Cheema

Abstract:

In order to introduce an integrated research information system, this will provide scientific institutions with the necessary information on research activities and research results in assured quality. Since data collection, duplication, missing values, incorrect formatting, inconsistencies, etc. can arise in the collection of research data in different research information systems, which can have a wide range of negative effects on data quality, the subject of data quality should be treated with better results. This paper examines the data quality problems in research information systems and presents the new techniques that enable organizations to improve their quality of research information.

Keywords: Research information systems (RIS), research information, heterogeneous sources, data quality, data cleansing, science system, standardization

Procedia PDF Downloads 151
26036 Data Mining Meets Educational Analysis: Opportunities and Challenges for Research

Authors: Carla Silva

Abstract:

Recent development of information and communication technology enables us to acquire, collect, analyse data in various fields of socioeconomic – technological systems. Along with the increase of economic globalization and the evolution of information technology, data mining has become an important approach for economic data analysis. As a result, there has been a critical need for automated approaches to effective and efficient usage of massive amount of educational data, in order to support institutions to a strategic planning and investment decision-making. In this article, we will address data from several different perspectives and define the applied data to sciences. Many believe that 'big data' will transform business, government, and other aspects of the economy. We discuss how new data may impact educational policy and educational research. Large scale administrative data sets and proprietary private sector data can greatly improve the way we measure, track, and describe educational activity and educational impact. We also consider whether the big data predictive modeling tools that have emerged in statistics and computer science may prove useful in educational and furthermore in economics. Finally, we highlight a number of challenges and opportunities for future research.

Keywords: data mining, research analysis, investment decision-making, educational research

Procedia PDF Downloads 352
26035 Preserving Urban Cultural Heritage with Deep Learning: Color Planning for Japanese Merchant Towns

Authors: Dongqi Li, Yunjia Huang, Tomo Inoue, Kohei Inoue

Abstract:

With urbanization, urban cultural heritage is facing the impact and destruction of modernization and urbanization. Many historical areas are losing their historical information and regional cultural characteristics, so it is necessary to carry out systematic color planning for historical areas in conservation. As an early focus on urban color planning, Japan has a systematic approach to urban color planning. Hence, this paper selects five merchant towns from the category of important traditional building preservation areas in Japan as the subject of this study to explore the color structure and emotion of this type of historic area. First, the image semantic segmentation method identifies the buildings, roads, and landscape environments. Their color data were extracted for color composition and emotion analysis to summarize their common features. Second, the obtained Internet evaluations were extracted by natural language processing for keyword extraction. The correlation analysis of the color structure and keywords provides a valuable reference for conservation decisions for this historic area in the town. This paper also combines the color structure and Internet evaluation results with generative adversarial networks to generate predicted images of color structure improvements and color improvement schemes. The methods and conclusions of this paper can provide new ideas for the digital management of environmental colors in historic districts and provide a valuable reference for the inheritance of local traditional culture.

Keywords: historic districts, color planning, semantic segmentation, natural language processing

Procedia PDF Downloads 80
26034 The Effect of the Internal Organization Communications' Effectiveness through Employee's Performance of Faculty of Management Science, Suan Sunandha Rajabhat University

Authors: Malaiphan Pansap, Surasit Vithayarat

Abstract:

The purpose of this study was to study the relationship between internal organization communications’ effectiveness and employee’s performance of Faculty of Management Science, Suan Sunandha Rajabhat University. Study on solutions of communication were carried out within the organization. Questionnaire was used to collect information from 136 people of staff and instructor and data were analyzed by using frequency, percentage, mean and standard deviation and then data processing statistic programs. The result found that organization communication that affects their employee’s performance is sender which lack the skills for speaking and writing to convince audiences ready before taking message and the message which organizations are not always informed. The employees believe the behavior of good organization communication has a positive impact on the development of organization because the employees feel involved and be a part of the organization, by the cooperation in working to achieve the goal, the employees can work in the same direction and meet goal quickly.

Keywords: employee’s performance, faculty of management science, internal organization communications’ effectiveness, management accounting, Suan Sunandha Rajabhat University

Procedia PDF Downloads 234
26033 Preprocessing and Fusion of Multiple Representation of Finger Vein patterns using Conventional and Machine Learning techniques

Authors: Tomas Trainys, Algimantas Venckauskas

Abstract:

Application of biometric features to the cryptography for human identification and authentication is widely studied and promising area of the development of high-reliability cryptosystems. Biometric cryptosystems typically are designed for patterns recognition, which allows biometric data acquisition from an individual, extracts feature sets, compares the feature set against the set stored in the vault and gives a result of the comparison. Preprocessing and fusion of biometric data are the most important phases in generating a feature vector for key generation or authentication. Fusion of biometric features is critical for achieving a higher level of security and prevents from possible spoofing attacks. The paper focuses on the tasks of initial processing and fusion of multiple representations of finger vein modality patterns. These tasks are solved by applying conventional image preprocessing methods and machine learning techniques, Convolutional Neural Network (SVM) method for image segmentation and feature extraction. An article presents a method for generating sets of biometric features from a finger vein network using several instances of the same modality. Extracted features sets were fused at the feature level. The proposed method was tested and compared with the performance and accuracy results of other authors.

Keywords: bio-cryptography, biometrics, cryptographic key generation, data fusion, information security, SVM, pattern recognition, finger vein method.

Procedia PDF Downloads 145
26032 Enhancing Large Language Models' Data Analysis Capability with Planning-and-Execution and Code Generation Agents: A Use Case for Southeast Asia Real Estate Market Analytics

Authors: Kien Vu, Jien Min Soh, Mohamed Jahangir Abubacker, Piyawut Pattamanon, Soojin Lee, Suvro Banerjee

Abstract:

Recent advances in Generative Artificial Intelligence (GenAI), in particular Large Language Models (LLMs) have shown promise to disrupt multiple industries at scale. However, LLMs also present unique challenges, notably, these so-called "hallucination" which is the generation of outputs that are not grounded in the input data that hinders its adoption into production. Common practice to mitigate hallucination problem is utilizing Retrieval Agmented Generation (RAG) system to ground LLMs'response to ground truth. RAG converts the grounding documents into embeddings, retrieve the relevant parts with vector similarity between user's query and documents, then generates a response that is not only based on its pre-trained knowledge but also on the specific information from the retrieved documents. However, the RAG system is not suitable for tabular data and subsequent data analysis tasks due to multiple reasons such as information loss, data format, and retrieval mechanism. In this study, we have explored a novel methodology that combines planning-and-execution and code generation agents to enhance LLMs' data analysis capabilities. The approach enables LLMs to autonomously dissect a complex analytical task into simpler sub-tasks and requirements, then convert them into executable segments of code. In the final step, it generates the complete response from output of the executed code. When deployed beta version on DataSense, the property insight tool of PropertyGuru, the approach yielded promising results, as it was able to provide market insights and data visualization needs with high accuracy and extensive coverage by abstracting the complexities for real-estate agents and developers from non-programming background. In essence, the methodology not only refines the analytical process but also serves as a strategic tool for real estate professionals, aiding in market understanding and enhancement without the need for programming skills. The implication extends beyond immediate analytics, paving the way for a new era in the real estate industry characterized by efficiency and advanced data utilization.

Keywords: large language model, reasoning, planning and execution, code generation, natural language processing, prompt engineering, data analysis, real estate, data sense, PropertyGuru

Procedia PDF Downloads 76
26031 A Crowdsourced Homeless Data Collection System and its Econometric Analysis: Strengthening Inclusive Public Administration Policies

Authors: Praniil Nagaraj

Abstract:

This paper proposes a method to collect homeless data using crowdsourcing and presents an approach to analyze the data, demonstrating its potential to strengthen existing and future policies aimed at promoting socio-economic equilibrium. The 2023 Annual Homeless Assessment Report (AHAR) to Congress highlighted alarming statistics, emphasizing the need for effective decisionmaking and budget allocation within local planning bodies known as Continuums of Care (CoC). This paper's contributions can be categorized into three main areas. Firstly, a unique method for collecting homeless data is introduced, utilizing a user-friendly smartphone app (currently available for Android). The app enables the general public to quickly record information about homeless individuals, including the number of people and details about their living conditions. The collected data, including date, time, and location, is anonymized and securely transmitted to the cloud. It is anticipated that an increasing number of users motivated to contribute to society will adopt the app, thus expanding the data collection efforts. Duplicate data is addressed through simple classification methods, and historical data is utilized to fill in missing information. The second contribution of this paper is the description of data analysis techniques applied to the collected data. By combining this new data with existing information, statistical regression analysis is employed to gain insights into various aspects, such as distinguishing between unsheltered and sheltered homeless populations, as well as examining their correlation with factors like unemployment rates, housing affordability, and labor demand. Initial data is collected in San Francisco, while pre-existing information is drawn from three cities: San Francisco, New York City, and Washington D.C., facilitating the conduction of simulations. The third contribution focuses on demonstrating the practical implications of the data processing results. The challenges faced by key stakeholders, including charitable organizations and local city governments, are taken into consideration. Two case studies are presented as examples. The first case study explores improving the efficiency of food and necessities distribution, as well as medical assistance, driven by charitable organizations. The second case study examines the correlation between micro-geographic budget expenditure by local city governments and homeless information to justify budget allocation and expenditures. The ultimate objective of this endeavor is to enable the continuous enhancement of the quality of life for the underprivileged. It is hoped that through increased crowdsourcing of data from the public, the Generosity Curve and the Need Curve will intersect, leading to a better world for all.

Keywords: crowdsourcing, homelessness, socio-economic policies, statistical analysis

Procedia PDF Downloads 19
26030 Polarity Classification of Social Media Comments in Turkish

Authors: Migena Ceyhan, Zeynep Orhan, Dimitrios Karras

Abstract:

People in modern societies are continuously sharing their experiences, emotions, and thoughts in different areas of life. The information reaches almost everyone in real-time and can have an important impact in shaping people’s way of living. This phenomenon is very well recognized and advantageously used by the market representatives, trying to earn the most from this means. Given the abundance of information, people and organizations are looking for efficient tools that filter the countless data into important information, ready to analyze. This paper is a modest contribution in this field, describing the process of automatically classifying social media comments in the Turkish language into positive or negative. Once data is gathered and preprocessed, feature sets of selected single words or groups of words are build according to the characteristics of language used in the texts. These features are used later to train, and test a system according to different machine learning algorithms (Naïve Bayes, Sequential Minimal Optimization, J48, and Bayesian Linear Regression). The resultant high accuracies can be important feedback for decision-makers to improve the business strategies accordingly.

Keywords: feature selection, machine learning, natural language processing, sentiment analysis, social media reviews

Procedia PDF Downloads 143
26029 Testing Chat-GPT: An AI Application

Authors: Jana Ismail, Layla Fallatah, Maha Alshmaisi

Abstract:

ChatGPT, a cutting-edge language model built on the GPT-3.5 architecture, has garnered attention for its profound natural language processing capabilities, holding promise for transformative applications in customer service and content creation. This study delves into ChatGPT's architecture, aiming to comprehensively understand its strengths and potential limitations. Through systematic experiments across diverse domains, such as general knowledge and creative writing, we evaluated the model's coherence, context retention, and task-specific accuracy. While ChatGPT excels in generating human-like responses and demonstrates adaptability, occasional inaccuracies and sensitivity to input phrasing were observed. The study emphasizes the impact of prompt design on output quality, providing valuable insights for the nuanced deployment of ChatGPT in conversational AI and contributing to the ongoing discourse on the evolving landscape of natural language processing in artificial intelligence.

Keywords: artificial Inelegance, chatGPT, open AI, NLP

Procedia PDF Downloads 69
26028 Physicochemical and Sensorial Evaluation of Astringency Reduction in Cashew Apple (Annacardium occidentale L.) Powder Processing in Cookie Elaboration

Authors: Elida Gastelum-Martinez, Neith A. Pacheco-Lopez, Juan L. Morales-Landa

Abstract:

Cashew agroindustry obtained from cashew apple crop (Anacardium occidentale L.) generates large amounts of unused waste in Campeche, Mexico. Despite having a high content of nutritional compounds such as ascorbic acid, carotenoids, fiber, carbohydrates, and minerals, it is not consumed due to its astringent sensation. The aim of this work was to develop a processing method for cashew apple waste in order to obtain a powder with reduced astringency able to be used as an additive in the food industry. The processing method consisted first in reducing astringency by inducing tannins from cashew apple peel to react and form precipitating complexes with a colloid rich in proline and histidine. Then cashew apples were processed to obtain a dry powder. Astringency reduction was determined by total phenolic content and evaluated by sensorial analysis in cashew-apple-powder based cookies. Total phenolic content in processed powders showed up to 72% lower concentration compared to control samples. The sensorial evaluation indicated that cookies baked using cashew apple powder with reduced astringency were 96.8% preferred. Sensorial characteristics like texture, color and taste were also well-accepted attributes. In conclusion, the method applied for astringency reduction is a viable tool to produce cashew apple powder with desirable sensorial properties to be used in the development of food products.

Keywords: astringency reduction, cashew apple waste, food industry, sensorial evaluation

Procedia PDF Downloads 348
26027 A Method of Detecting the Difference in Two States of Brain Using Statistical Analysis of EEG Raw Data

Authors: Digvijaysingh S. Bana, Kiran R. Trivedi

Abstract:

This paper introduces various methods for the alpha wave to detect the difference between two states of brain. One healthy subject participated in the experiment. EEG was measured on the forehead above the eye (FP1 Position) with reference and ground electrode are on the ear clip. The data samples are obtained in the form of EEG raw data. The time duration of reading is of one minute. Various test are being performed on the alpha band EEG raw data.The readings are performed in different time duration of the entire day. The statistical analysis is being carried out on the EEG sample data in the form of various tests.

Keywords: electroencephalogram(EEG), biometrics, authentication, EEG raw data

Procedia PDF Downloads 457
26026 Development of an Optimization Method for Myoelectric Signal Processing by Active Matrix Sensing in Robot Rehabilitation

Authors: Noriyoshi Yamauchi, Etsuo Horikawa, Takunori Tsuji

Abstract:

Training by exoskeleton robot is drawing attention as a rehabilitation method for body paralysis seen in many cases, and there are many forms that assist with the myoelectric signal generated by exercise commands from the brain. Rehabilitation requires more frequent training, but it is one of the reasons that the technology is required for the identification of the myoelectric potential derivation site and attachment of the device is preventing the spread of paralysis. In this research, we focus on improving the efficiency of gait training by exoskeleton type robots, improvement of myoelectric acquisition and analysis method using active matrix sensing method, and improvement of walking rehabilitation and walking by optimization of robot control.

Keywords: active matrix sensing, brain machine interface (BMI), the central pattern generator (CPG), myoelectric signal processing, robot rehabilitation

Procedia PDF Downloads 384
26025 Relative Entropy Used to Determine the Divergence of Cells in Single Cell RNA Sequence Data Analysis

Authors: An Chengrui, Yin Zi, Wu Bingbing, Ma Yuanzhu, Jin Kaixiu, Chen Xiao, Ouyang Hongwei

Abstract:

Single cell RNA sequence (scRNA-seq) is one of the effective tools to study transcriptomics of biological processes. Recently, similarity measurement of cells is Euclidian distance or its derivatives. However, the process of scRNA-seq is a multi-variate Bernoulli event model, thus we hypothesize that it would be more efficient when the divergence between cells is valued with relative entropy than Euclidian distance. In this study, we compared the performances of Euclidian distance, Spearman correlation distance and Relative Entropy using scRNA-seq data of the early, medial and late stage of limb development generated in our lab. Relative Entropy is better than other methods according to cluster potential test. Furthermore, we developed KL-SNE, an algorithm modifying t-SNE whose definition of divergence between cells Euclidian distance to Kullback–Leibler divergence. Results showed that KL-SNE was more effective to dissect cell heterogeneity than t-SNE, indicating the better performance of relative entropy than Euclidian distance. Specifically, the chondrocyte expressing Comp was clustered together with KL-SNE but not with t-SNE. Surprisingly, cells in early stage were surrounded by cells in medial stage in the processing of KL-SNE while medial cells neighbored to late stage with the process of t-SNE. This results parallel to Heatmap which showed cells in medial stage were more heterogenic than cells in other stages. In addition, we also found that results of KL-SNE tend to follow Gaussian distribution compared with those of the t-SNE, which could also be verified with the analysis of scRNA-seq data from another study on human embryo development. Therefore, it is also an effective way to convert non-Gaussian distribution to Gaussian distribution and facilitate the subsequent statistic possesses. Thus, relative entropy is potentially a better way to determine the divergence of cells in scRNA-seq data analysis.

Keywords: Single cell RNA sequence, Similarity measurement, Relative Entropy, KL-SNE, t-SNE

Procedia PDF Downloads 336
26024 E4D-MP: Time-Lapse Multiphysics Simulation and Joint Inversion Toolset for Large-Scale Subsurface Imaging

Authors: Zhuanfang Fred Zhang, Tim C. Johnson, Yilin Fang, Chris E. Strickland

Abstract:

A variety of geophysical techniques are available to image the opaque subsurface with little or no contact with the soil. It is common to conduct time-lapse surveys of different types for a given site for improved results of subsurface imaging. Regardless of the chosen survey methods, it is often a challenge to process the massive amount of survey data. The currently available software applications are generally based on the one-dimensional assumption for a desktop personal computer. Hence, they are usually incapable of imaging the three-dimensional (3D) processes/variables in the subsurface of reasonable spatial scales; the maximum amount of data that can be inverted simultaneously is often very small due to the capability limitation of personal computers. Presently, high-performance or integrating software that enables real-time integration of multi-process geophysical methods is needed. E4D-MP enables the integration and inversion of time-lapsed large-scale data surveys from geophysical methods. Using the supercomputing capability and parallel computation algorithm, E4D-MP is capable of processing data across vast spatiotemporal scales and in near real time. The main code and the modules of E4D-MP for inverting individual or combined data sets of time-lapse 3D electrical resistivity, spectral induced polarization, and gravity surveys have been developed and demonstrated for sub-surface imaging. E4D-MP provides capability of imaging the processes (e.g., liquid or gas flow, solute transport, cavity development) and subsurface properties (e.g., rock/soil density, conductivity) critical for successful control of environmental engineering related efforts such as environmental remediation, carbon sequestration, geothermal exploration, and mine land reclamation, among others.

Keywords: gravity survey, high-performance computing, sub-surface monitoring, electrical resistivity tomography

Procedia PDF Downloads 148
26023 Wh-Movement in Second Language Acquisition: Evidence from Magnitude Estimation

Authors: Dong-Bo Hsu

Abstract:

Universal Grammar (UG) claims that the constraints that are derived from this should operate in language users’ L2 grammars. This study investigated this hypothesis on knowledge of Subjacency and resumptive pronoun usage among Chinese learners of English. Chinese fulfills two requirements to examine the existence of UG, i.e., Subjacency does not operate in Chinese and resumptive pronouns in English are very different from those in Chinese and second L2 input undermines the knowledge of Subjacency. The results indicated that Chinese learners of English demonstrated a nearly identical pattern as English native speakers do but the resumptive pronoun in the embedding clauses. This may be explained in terms of the case that Chinese speakers’ usage of pronouns is not influenced by the number of embedding clauses. Chinese learners of English have full access to knowledge endowed by UG but their processing of English sentences may be different from native speakers as a general slow rate for processing in their L2 English.

Keywords: universal grammar, Chinese, English, wh-questions, resumption

Procedia PDF Downloads 464
26022 Linguistic Analysis of Argumentation Structures in Georgian Political Speeches

Authors: Mariam Matiashvili

Abstract:

Argumentation is an integral part of our daily communications - formal or informal. Argumentative reasoning, techniques, and language tools are used both in personal conversations and in the business environment. Verbalization of the opinions requires the use of extraordinary syntactic-pragmatic structural quantities - arguments that add credibility to the statement. The study of argumentative structures allows us to identify the linguistic features that make the text argumentative. Knowing what elements make up an argumentative text in a particular language helps the users of that language improve their skills. Also, natural language processing (NLP) has become especially relevant recently. In this context, one of the main emphases is on the computational processing of argumentative texts, which will enable the automatic recognition and analysis of large volumes of textual data. The research deals with the linguistic analysis of the argumentative structures of Georgian political speeches - particularly the linguistic structure, characteristics, and functions of the parts of the argumentative text - claims, support, and attack statements. The research aims to describe the linguistic cues that give the sentence a judgmental/controversial character and helps to identify reasoning parts of the argumentative text. The empirical data comes from the Georgian Political Corpus, particularly TV debates. Consequently, the texts are of a dialogical nature, representing a discussion between two or more people (most often between a journalist and a politician). The research uses the following approaches to identify and analyze the argumentative structures Lexical Classification & Analysis - Identify lexical items that are relevant in argumentative texts creating process - Creating the lexicon of argumentation (presents groups of words gathered from a semantic point of view); Grammatical Analysis and Classification - means grammatical analysis of the words and phrases identified based on the arguing lexicon. Argumentation Schemas - Describe and identify the Argumentation Schemes that are most likely used in Georgian Political Speeches. As a final step, we analyzed the relations between the above mentioned components. For example, If an identified argument scheme is “Argument from Analogy”, identified lexical items semantically express analogy too, and they are most likely adverbs in Georgian. As a result, we created the lexicon with the words that play a significant role in creating Georgian argumentative structures. Linguistic analysis has shown that verbs play a crucial role in creating argumentative structures.

Keywords: georgian, argumentation schemas, argumentation structures, argumentation lexicon

Procedia PDF Downloads 66
26021 Influence of Building Orientation and Post Processing Materials on Mechanical Properties of 3D-Printed Parts

Authors: Raf E. Ul Shougat, Ezazul Haque Sabuz, G. M. Najmul Quader, Monon Mahboob

Abstract:

Since there are lots of ways for building and post processing of parts or models in 3D printing technology, the main objective of this research is to provide an understanding how mechanical characteristics of 3D printed parts get changed for different building orientations and infiltrates. Tensile, compressive, flexure, and hardness test were performed for the analysis of mechanical properties of those models. Specimens were designed in CAD software, printed on Z-printer 450 with five different build orientations and post processed with four different infiltrates. Results show that with the change of infiltrates or orientations each of the above mechanical property changes and for each infiltrate the highest tensile strength, flexural strength, and hardness are found for such orientation where there is the lowest number of layers while printing.

Keywords: 3D printing, building orientations, infiltrates, mechanical characteristics, number of layers

Procedia PDF Downloads 276
26020 Profiling Risky Code Using Machine Learning

Authors: Zunaira Zaman, David Bohannon

Abstract:

This study explores the application of machine learning (ML) for detecting security vulnerabilities in source code. The research aims to assist organizations with large application portfolios and limited security testing capabilities in prioritizing security activities. ML-based approaches offer benefits such as increased confidence scores, false positives and negatives tuning, and automated feedback. The initial approach using natural language processing techniques to extract features achieved 86% accuracy during the training phase but suffered from overfitting and performed poorly on unseen datasets during testing. To address these issues, the study proposes using the abstract syntax tree (AST) for Java and C++ codebases to capture code semantics and structure and generate path-context representations for each function. The Code2Vec model architecture is used to learn distributed representations of source code snippets for training a machine-learning classifier for vulnerability prediction. The study evaluates the performance of the proposed methodology using two datasets and compares the results with existing approaches. The Devign dataset yielded 60% accuracy in predicting vulnerable code snippets and helped resist overfitting, while the Juliet Test Suite predicted specific vulnerabilities such as OS-Command Injection, Cryptographic, and Cross-Site Scripting vulnerabilities. The Code2Vec model achieved 75% accuracy and a 98% recall rate in predicting OS-Command Injection vulnerabilities. The study concludes that even partial AST representations of source code can be useful for vulnerability prediction. The approach has the potential for automated intelligent analysis of source code, including vulnerability prediction on unseen source code. State-of-the-art models using natural language processing techniques and CNN models with ensemble modelling techniques did not generalize well on unseen data and faced overfitting issues. However, predicting vulnerabilities in source code using machine learning poses challenges such as high dimensionality and complexity of source code, imbalanced datasets, and identifying specific types of vulnerabilities. Future work will address these challenges and expand the scope of the research.

Keywords: code embeddings, neural networks, natural language processing, OS command injection, software security, code properties

Procedia PDF Downloads 100
26019 Big Data Applications for Transportation Planning

Authors: Antonella Falanga, Armando Cartenì

Abstract:

"Big data" refers to extremely vast and complex sets of data, encompassing extraordinarily large and intricate datasets that require specific tools for meaningful analysis and processing. These datasets can stem from diverse origins like sensors, mobile devices, online transactions, social media platforms, and more. The utilization of big data is pivotal, offering the chance to leverage vast information for substantial advantages across diverse fields, thereby enhancing comprehension, decision-making, efficiency, and fostering innovation in various domains. Big data, distinguished by its remarkable attributes of enormous volume, high velocity, diverse variety, and significant value, represent a transformative force reshaping the industry worldwide. Their pervasive impact continues to unlock new possibilities, driving innovation and advancements in technology, decision-making processes, and societal progress in an increasingly data-centric world. The use of these technologies is becoming more widespread, facilitating and accelerating operations that were once much more complicated. In particular, big data impacts across multiple sectors such as business and commerce, healthcare and science, finance, education, geography, agriculture, media and entertainment and also mobility and logistics. Within the transportation sector, which is the focus of this study, big data applications encompass a wide variety, spanning across optimization in vehicle routing, real-time traffic management and monitoring, logistics efficiency, reduction of travel times and congestion, enhancement of the overall transportation systems, but also mitigation of pollutant emissions contributing to environmental sustainability. Meanwhile, in public administration and the development of smart cities, big data aids in improving public services, urban planning, and decision-making processes, leading to more efficient and sustainable urban environments. Access to vast data reservoirs enables deeper insights, revealing hidden patterns and facilitating more precise and timely decision-making. Additionally, advancements in cloud computing and artificial intelligence (AI) have further amplified the potential of big data, enabling more sophisticated and comprehensive analyses. Certainly, utilizing big data presents various advantages but also entails several challenges regarding data privacy and security, ensuring data quality, managing and storing large volumes of data effectively, integrating data from diverse sources, the need for specialized skills to interpret analysis results, ethical considerations in data use, and evaluating costs against benefits. Addressing these difficulties requires well-structured strategies and policies to balance the benefits of big data with privacy, security, and efficient data management concerns. Building upon these premises, the current research investigates the efficacy and influence of big data by conducting an overview of the primary and recent implementations of big data in transportation systems. Overall, this research allows us to conclude that big data better provide to enhance rational decision-making for mobility choices and is imperative for adeptly planning and allocating investments in transportation infrastructures and services.

Keywords: big data, public transport, sustainable mobility, transport demand, transportation planning

Procedia PDF Downloads 55
26018 A Study on Big Data Analytics, Applications and Challenges

Authors: Chhavi Rana

Abstract:

The aim of the paper is to highlight the existing development in the field of big data analytics. Applications like bioinformatics, smart infrastructure projects, Healthcare, and business intelligence contain voluminous and incremental data, which is hard to organise and analyse and can be dealt with using the framework and model in this field of study. An organization's decision-making strategy can be enhanced using big data analytics and applying different machine learning techniques and statistical tools on such complex data sets that will consequently make better things for society. This paper reviews the current state of the art in this field of study as well as different application domains of big data analytics. It also elaborates on various frameworks in the process of Analysis using different machine-learning techniques. Finally, the paper concludes by stating different challenges and issues raised in existing research.

Keywords: big data, big data analytics, machine learning, review

Procedia PDF Downloads 77
26017 A Study on Big Data Analytics, Applications, and Challenges

Authors: Chhavi Rana

Abstract:

The aim of the paper is to highlight the existing development in the field of big data analytics. Applications like bioinformatics, smart infrastructure projects, healthcare, and business intelligence contain voluminous and incremental data which is hard to organise and analyse and can be dealt with using the framework and model in this field of study. An organisation decision-making strategy can be enhanced by using big data analytics and applying different machine learning techniques and statistical tools to such complex data sets that will consequently make better things for society. This paper reviews the current state of the art in this field of study as well as different application domains of big data analytics. It also elaborates various frameworks in the process of analysis using different machine learning techniques. Finally, the paper concludes by stating different challenges and issues raised in existing research.

Keywords: big data, big data analytics, machine learning, review

Procedia PDF Downloads 86
26016 Improved K-Means Clustering Algorithm Using RHadoop with Combiner

Authors: Ji Eun Shin, Dong Hoon Lim

Abstract:

Data clustering is a common technique used in data analysis and is used in many applications, such as artificial intelligence, pattern recognition, economics, ecology, psychiatry and marketing. K-means clustering is a well-known clustering algorithm aiming to cluster a set of data points to a predefined number of clusters. In this paper, we implement K-means algorithm based on MapReduce framework with RHadoop to make the clustering method applicable to large scale data. RHadoop is a collection of R packages that allow users to manage and analyze data with Hadoop. The main idea is to introduce a combiner as a function of our map output to decrease the amount of data needed to be processed by reducers. The experimental results demonstrated that K-means algorithm using RHadoop can scale well and efficiently process large data sets on commodity hardware. We also showed that our K-means algorithm using RHadoop with combiner was faster than regular algorithm without combiner as the size of data set increases.

Keywords: big data, combiner, K-means clustering, RHadoop

Procedia PDF Downloads 426
26015 Effects of Bilingual Education in the Teaching and Learning Practices in the Continuous Improvement and Development of k12 Program

Authors: Miriam Sebastian

Abstract:

This research focused on the effects of bilingual education as medium of instruction to the academic performance of selected intermediate students of Miriam’s Academy of Valenzuela Inc. . An experimental design was used, with language of instruction as the independent variable and the different literacy skills as dependent variables. The sample consisted of experimental students comprises of 30 students were exposed to bilingual education (Filipino and English) . They were given pretests and were divided into three groups: Monolingual Filipino, Monolingual English, and Bilingual. They were taught different literacy skills for eight weeks and were then administered the posttests. Data was analyzed and evaluated in the light of the central processing and script-dependent hypotheses. Based on the data, it can be inferred that monolingual instruction in either Filipino or English had a stronger effect on the students’ literacy skills compared to bilingual instruction. Moreover, mother tongue-based instruction, as compared to second-language instruction, had stronger effect on the preschoolers’ literacy skills. Such results have implications not only for mother tongue-based (MTB) but also for English as a second language (ESL) instruction in the country

Keywords: bilingualism, effects, monolingual, function, multilingual, mother tongue

Procedia PDF Downloads 122
26014 An Efficient Clustering Technique for Copy-Paste Attack Detection

Authors: N. Chaitawittanun, M. Munlin

Abstract:

Due to rapid advancement of powerful image processing software, digital images are easy to manipulate and modify by ordinary people. Lots of digital images are edited for a specific purpose and more difficult to distinguish form their original ones. We propose a clustering method to detect a copy-move image forgery of JPEG, BMP, TIFF, and PNG. The process starts with reducing the color of the photos. Then, we use the clustering technique to divide information of measuring data by Hausdorff Distance. The result shows that the purposed methods is capable of inspecting the image file and correctly identify the forgery.

Keywords: image detection, forgery image, copy-paste, attack detection

Procedia PDF Downloads 333
26013 Framework for Integrating Big Data and Thick Data: Understanding Customers Better

Authors: Nikita Valluri, Vatcharaporn Esichaikul

Abstract:

With the popularity of data-driven decision making on the rise, this study focuses on providing an alternative outlook towards the process of decision-making. Combining quantitative and qualitative methods rooted in the social sciences, an integrated framework is presented with a focus on delivering a much more robust and efficient approach towards the concept of data-driven decision-making with respect to not only Big data but also 'Thick data', a new form of qualitative data. In support of this, an example from the retail sector has been illustrated where the framework is put into action to yield insights and leverage business intelligence. An interpretive approach to analyze findings from both kinds of quantitative and qualitative data has been used to glean insights. Using traditional Point-of-sale data as well as an understanding of customer psychographics and preferences, techniques of data mining along with qualitative methods (such as grounded theory, ethnomethodology, etc.) are applied. This study’s final goal is to establish the framework as a basis for providing a holistic solution encompassing both the Big and Thick aspects of any business need. The proposed framework is a modified enhancement in lieu of traditional data-driven decision-making approach, which is mainly dependent on quantitative data for decision-making.

Keywords: big data, customer behavior, customer experience, data mining, qualitative methods, quantitative methods, thick data

Procedia PDF Downloads 153
26012 Processing Studies and Challenges Faced in Development of High-Pressure Titanium Alloy Cryogenic Gas Bottles

Authors: Bhanu Pant, Sanjay H. Upadhyay

Abstract:

Frequently, the upper stage of high-performance launch vehicles utilizes cryogenic tank-submerged pressurization gas bottles with high volume-to-weight efficiency to achieve a direct gain in the satellite payload. Titanium alloys, owing to their high specific strength coupled with excellent compatibility with various fluids, are the materials of choice for these applications. Amongst the Titanium alloys, there are two alloys suitable for cryogenic applications, namely Ti6Al4V-ELI and Ti5Al2.5Sn-ELI. The two-phase alpha-beta alloy Ti6Al4V-ELI is usable up to LOX temperature of 90K, while the single-phase alpha alloy Ti5Al2.5Sn-ELI can be used down to LHe temperature of 4 K. The high-pressure gas bottles submerged in the LH2 (20K) can store more amount of gas in as compared to those submerged in LOX (90K) bottles the same volume. Thus, the use of these alpha alloy gas bottles stored at 20K gives a distinct advantage with respect to the need for a lesser number of gas bottles to store the same amount of high-pressure gas, which in turn leads to a one-to-one advantage in the payload in the satellite. The cost advantage to the tune of 15000$/ kg of weight is saved in the upper stages, and, thereby, the satellite payload gain is expected by this change. However, the processing of alpha Ti5Al2.5Sn-ELI alloy gas bottles poses challenges due to the lower forgeability of the alloy and mode of qualification for the critical severe application environment. The present paper describes the processing and challenges/ solutions during the development of these advanced gas bottles for LH2 (20K) applications.

Keywords: titanium alloys, cryogenic gas bottles, alpha titanium alloy, alpha-beta titanium alloy

Procedia PDF Downloads 48
26011 The Effect of Fly Ash in Dewatering of Marble Processing Wastewaters

Authors: H. A. Taner, V. Önen

Abstract:

In the thermal power plants established to meet the energy need, lignite with low calorie and high ash content is used. Burning of these coals results in wastes such as fly ash, slag and flue gas. This constitutes a significant economic and environmental problems. However, fly ash can find evaluation opportunities in various sectors. In this study, the effectiveness of fly ash on suspended solid removal from marble processing wastewater containing high concentration of suspended solids was examined. Experiments were carried out for two different suspensions, marble and travertine. In the experiments, FeCl3, Al2(SO4)3 and anionic polymer A130 were used also to compare with fly ash. Coagulant/flocculant type/dosage, mixing time/speed and pH were the experimental parameters. The performances in the experimental studies were assessed with the change in the interface height during sedimentation resultant and turbidity values of treated water. The highest sedimentation efficiency was achieved with anionic flocculant. However, it was determined that fly ash can be used instead of FeCl3 and Al2(SO4)3 in the travertine plant as a coagulant.

Keywords: dewatering, flocculant, fly ash, marble plant wastewater

Procedia PDF Downloads 149
26010 Landsat Data from Pre Crop Season to Estimate the Area to Be Planted with Summer Crops

Authors: Valdir Moura, Raniele dos Anjos de Souza, Fernando Gomes de Souza, Jose Vagner da Silva, Jerry Adriani Johann

Abstract:

The estimate of the Area of Land to be planted with annual crops and its stratification by the municipality are important variables in crop forecast. Nowadays in Brazil, these information’s are obtained by the Brazilian Institute of Geography and Statistics (IBGE) and published under the report Assessment of the Agricultural Production. Due to the high cloud cover in the main crop growing season (October to March) it is difficult to acquire good orbital images. Thus, one alternative is to work with remote sensing data from dates before the crop growing season. This work presents the use of multitemporal Landsat data gathered on July and September (before the summer growing season) in order to estimate the area of land to be planted with summer crops in an area of São Paulo State, Brazil. Geographic Information Systems (GIS) and digital image processing techniques were applied for the treatment of the available data. Supervised and non-supervised classifications were used for data in digital number and reflectance formats and the multitemporal Normalized Difference Vegetation Index (NDVI) images. The objective was to discriminate the tracts with higher probability to become planted with summer crops. Classification accuracies were evaluated using a sampling system developed basically for this study region. The estimated areas were corrected using the error matrix derived from these evaluations. The classification techniques presented an excellent level according to the kappa index. The proportion of crops stratified by municipalities was derived by a field work during the crop growing season. These proportion coefficients were applied onto the area of land to be planted with summer crops (derived from Landsat data). Thus, it was possible to derive the area of each summer crop by the municipality. The discrepancies between official statistics and our results were attributed to the sampling and the stratification procedures. Nevertheless, this methodology can be improved in order to provide good crop area estimates using remote sensing data, despite the cloud cover during the growing season.

Keywords: area intended for summer culture, estimated area planted, agriculture, Landsat, planting schedule

Procedia PDF Downloads 143
26009 Open Data for e-Governance: Case Study of Bangladesh

Authors: Sami Kabir, Sadek Hossain Khoka

Abstract:

Open Government Data (OGD) refers to all data produced by government which are accessible in reusable way by common people with access to Internet and at free of cost. In line with “Digital Bangladesh” vision of Bangladesh government, the concept of open data has been gaining momentum in the country. Opening all government data in digital and customizable format from single platform can enhance e-governance which will make government more transparent to the people. This paper presents a well-in-progress case study on OGD portal by Bangladesh Government in order to link decentralized data. The initiative is intended to facilitate e-service towards citizens through this one-stop web portal. The paper further discusses ways of collecting data in digital format from relevant agencies with a view to making it publicly available through this single point of access. Further, possible layout of this web portal is presented.

Keywords: e-governance, one-stop web portal, open government data, reusable data, web of data

Procedia PDF Downloads 346