Search results for: symbolic data analysis
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 40780

Search results for: symbolic data analysis

40600 A Study on Sentiment Analysis Using Various ML/NLP Models on Historical Data of Indian Leaders

Authors: Sarthak Deshpande, Akshay Patil, Pradip Pandhare, Nikhil Wankhede, Rushali Deshmukh

Abstract:

Among the highly significant duties for any language most effective is the sentiment analysis, which is also a key area of NLP, that recently made impressive strides. There are several models and datasets available for those tasks in popular and commonly used languages like English, Russian, and Spanish. While sentiment analysis research is performed extensively, however it is lagging behind for the regional languages having few resources such as Hindi, Marathi. Marathi is one of the languages that included in the Indian Constitution’s 8th schedule and is the third most widely spoken language in the country and primarily spoken in the Deccan region, which encompasses Maharashtra and Goa. There isn’t sufficient study on sentiment analysis methods based on Marathi text due to lack of available resources, information. Therefore, this project proposes the use of different ML/NLP models for the analysis of Marathi data from the comments below YouTube content, tweets or Instagram posts. We aim to achieve a short and precise analysis and summary of the related data using our dataset (Dates, names, root words) and lexicons to locate exact information.

Keywords: multilingual sentiment analysis, Marathi, natural language processing, text summarization, lexicon-based approaches

Procedia PDF Downloads 40
40599 Longitudinal Analysis of Internet Speed Data in the Gulf Cooperation Council Region

Authors: Musab Isah

Abstract:

This paper presents a longitudinal analysis of Internet speed data in the Gulf Cooperation Council (GCC) region, focusing on the most populous cities of each of the six countries – Riyadh, Saudi Arabia; Dubai, UAE; Kuwait City, Kuwait; Doha, Qatar; Manama, Bahrain; and Muscat, Oman. The study utilizes data collected from the Measurement Lab (M-Lab) infrastructure over a five-year period from January 1, 2019, to December 31, 2023. The analysis includes downstream and upstream throughput data for the cities, covering significant events such as the launch of 5G networks in 2019, COVID-19-induced lockdowns in 2020 and 2021, and the subsequent recovery period and return to normalcy. The results showcase substantial increases in Internet speeds across the cities, highlighting improvements in both download and upload throughput over the years. All the GCC countries have achieved above-average Internet speeds that can conveniently support various online activities and applications with excellent user experience.

Keywords: internet data science, internet performance measurement, throughput analysis, internet speed, measurement lab, network diagnostic tool

Procedia PDF Downloads 22
40598 Extreme Temperature Forecast in Mbonge, Cameroon Through Return Level Analysis of the Generalized Extreme Value (GEV) Distribution

Authors: Nkongho Ayuketang Arreyndip, Ebobenow Joseph

Abstract:

In this paper, temperature extremes are forecast by employing the block maxima method of the generalized extreme value (GEV) distribution to analyse temperature data from the Cameroon Development Corporation (CDC). By considering two sets of data (raw data and simulated data) and two (stationary and non-stationary) models of the GEV distribution, return levels analysis is carried out and it was found that in the stationary model, the return values are constant over time with the raw data, while in the simulated data the return values show an increasing trend with an upper bound. In the non-stationary model, the return levels of both the raw data and simulated data show an increasing trend with an upper bound. This clearly shows that although temperatures in the tropics show a sign of increase in the future, there is a maximum temperature at which there is no exceedance. The results of this paper are very vital in agricultural and environmental research.

Keywords: forecasting, generalized extreme value (GEV), meteorology, return level

Procedia PDF Downloads 433
40597 Analysis of Spatial and Temporal Data Using Remote Sensing Technology

Authors: Kapil Pandey, Vishnu Goyal

Abstract:

Spatial and temporal data analysis is very well known in the field of satellite image processing. When spatial data are correlated with time, series analysis it gives the significant results in change detection studies. In this paper the GIS and Remote sensing techniques has been used to find the change detection using time series satellite imagery of Uttarakhand state during the years of 1990-2010. Natural vegetation, urban area, forest cover etc. were chosen as main landuse classes to study. Landuse/ landcover classes within several years were prepared using satellite images. Maximum likelihood supervised classification technique was adopted in this work and finally landuse change index has been generated and graphical models were used to present the changes.

Keywords: GIS, landuse/landcover, spatial and temporal data, remote sensing

Procedia PDF Downloads 404
40596 Discourse Analysis and Semiotic Researches: Using Michael Halliday's Sociosemiotic Theory

Authors: Deyu Yuan

Abstract:

Discourse analysis as an interdisciplinary approach has more than 60-years-history since it was first named by Zellig Harris in 'Discourse Analysis' on Language in 1952. Ferdinand de Saussure differentiated the 'parole' from the 'langue' that established the principle of focusing on language but not speech. So the rising of discourse analysis can be seen as a discursive turn for the entire language research that closely related to the theory of Speech act. Critical discourse analysis becomes the mainstream of contemporary language research through drawing upon M. A. K. Halliday's socio-semiotic theory and Foucault, Barthes, Bourdieu's views on the sign, discourse, and ideology. So in contrast to general semiotics, social semiotics mainly focuses on parole and the application of semiotic theories to some applicable fields. The article attempts to discuss this applicable sociosemiotics and show the features of it that differ from the Saussurian and Peircian semiotics in four aspects: 1) the sign system is about meaning-generation resource in the social context; 2) the sign system conforms to social and cultural changes with the form of metaphor and connotation; 3) sociosemiotics concerns about five applicable principles including the personal authority principle, non-personal authority principle, consistency principle, model demonstration principle, the expertise principle to deepen specific communication; 4) the study of symbolic functions is targeted to the characteristics of ideational, interpersonal and interactional function in social communication process. Then the paper describes six features which characterize this sociosemiotics as applicable semiotics: social, systematic, usable interdisciplinary, dynamic, and multi-modal characteristics. Thirdly, the paper explores the multi-modal choices of sociosemiotics in the respects of genre, discourse, and style. Finally, the paper discusses the relationship between theory and practice in social semiotics and proposes a relatively comprehensive theoretical framework for social semiotics as applicable semiotics.

Keywords: discourse analysis, sociosemiotics, pragmatics, ideology

Procedia PDF Downloads 303
40595 Symbolic Analysis of Input Impedance of CMOS Floating Active Inductors with Application in Fully Differential Bandpass Amplifier

Authors: Kittipong Tripetch

Abstract:

This paper proposes studies of input impedance of two types of the CMOS active inductor. It derives two input impedance formulas. The first formula is the input impedance of a grounded active inductor. The second formula is an input impedance of floating active inductor. After that, these formulas can be used to simulate magnitude and phase response of input impedance as a function of current consumption with MATLAB. Common mode rejection ratio (CMRR) of a fully differential bandpass amplifier is derived based on superposition principle. CMRR as a function of input frequency is plotted as a function of current consumption

Keywords: grounded active inductor, floating active inductor, fully differential bandpass amplifier

Procedia PDF Downloads 401
40594 Accelerating Side Channel Analysis with Distributed and Parallelized Processing

Authors: Kyunghee Oh, Dooho Choi

Abstract:

Although there is no theoretical weakness in a cryptographic algorithm, Side Channel Analysis can find out some secret data from the physical implementation of a cryptosystem. The analysis is based on extra information such as timing information, power consumption, electromagnetic leaks or even sound which can be exploited to break the system. Differential Power Analysis is one of the most popular analyses, as computing the statistical correlations of the secret keys and power consumptions. It is usually necessary to calculate huge data and takes a long time. It may take several weeks for some devices with countermeasures. We suggest and evaluate the methods to shorten the time to analyze cryptosystems. Our methods include distributed computing and parallelized processing.

Keywords: DPA, distributed computing, parallelized processing, side channel analysis

Procedia PDF Downloads 396
40593 On the Estimation of Crime Rate in the Southwest of Nigeria: Principal Component Analysis Approach

Authors: Kayode Balogun, Femi Ayoola

Abstract:

Crime is at alarming rate in this part of world and there are many factors that are contributing to this antisocietal behaviour both among the youths and old. In this work, principal component analysis (PCA) was used as a tool to reduce the dimensionality and to really know those variables that were crime prone in the study region. Data were collected on twenty-eight crime variables from National Bureau of Statistics (NBS) databank for a period of fifteen years, while retaining as much of the information as possible. We use PCA in this study to know the number of major variables and contributors to the crime in the Southwest Nigeria. The results of our analysis revealed that there were eight principal variables have been retained using the Scree plot and Loading plot which implies an eight-equation solution will be appropriate for the data. The eight components explained 93.81% of the total variation in the data set. We also found that the highest and commonly committed crimes in the Southwestern Nigeria were: Assault, Grievous Harm and Wounding, theft/stealing, burglary, house breaking, false pretence, unlawful arms possession and breach of public peace.

Keywords: crime rates, data, Southwest Nigeria, principal component analysis, variables

Procedia PDF Downloads 414
40592 Development of a Risk Disclosure Index and Examination of Its Determinants: An Empirical Study in Indian Context

Authors: M. V. Shivaani, P. K. Jain, Surendra S. Yadav

Abstract:

Worldwide regulators, practitioners and researchers view risk-disclosure as one of the most important steps that will promote corporate accountability and transparency. Recognizing this growing significance of risk disclosures, the paper first develops a risk disclosure index. Covering 69 risk items/themes, this index is developed by employing thematic content analysis and encompasses three attributes of disclosure: namely, nature (qualitative or quantitative), time horizon (backward-looking or forward-looking) and tone (no impact, positive impact or negative impact). As the focus of study is on substantive rather than symbolic disclosure, content analysis has been carried out manually. The study is based on non-financial companies of Nifty500 index and covers a ten year period from April 1, 2005 to March 31, 2015, thus yielding 3,872 annual reports for analysis. The analysis reveals that (on an average) only about 14% of risk items (i.e. about 10 out 69 risk items studied) are being disclosed by Indian companies. Risk items that are frequently disclosed are mostly macroeconomic in nature and their disclosures tend to be qualitative, forward-looking and conveying both positive and negative aspects of the concerned risk. The second objective of the paper is to gauge the factors that affect the level of disclosures in annual reports. Given the panel nature of data, and possible endogeneity amongst variables, Diff-GMM regression has been applied. The results indicate that age and size of firms have a significant positive impact on disclosure quality, whereas growth rate does not have a significant impact. Further, post-recession period (2009-2015) has witnessed significant improvement in quality of disclosures. In terms of corporate governance variables, board size, board independence, CEO duality, presence of CRO and constitution of risk management committee appear to be significant factors in determining the quality of risk disclosures. It is noteworthy that the study contributes to literature by putting forth a variant to existing disclosure indices that not only captures the quantity but also the quality of disclosures (in terms of semantic attributes). Also, the study is a first of its kind attempt in a prominent emerging market i.e. India. Therefore, this study is expected to facilitate regulators in mandating and regulating risk disclosures and companies in their endeavor to reduce information asymmetry.

Keywords: risk disclosure, voluntary disclosures, corporate governance, Diff-GMM

Procedia PDF Downloads 137
40591 Analyzing the Evolution of Adverse Events in Pharmacovigilance: A Data-Driven Approach

Authors: Kwaku Damoah

Abstract:

This study presents a comprehensive data-driven analysis to understand the evolution of adverse events (AEs) in pharmacovigilance. Utilizing data from the FDA Adverse Event Reporting System (FAERS), we employed three analytical methods: rank-based, frequency-based, and percentage change analyses. These methods assessed temporal trends and patterns in AE reporting, focusing on various drug-active ingredients and patient demographics. Our findings reveal significant trends in AE occurrences, with both increasing and decreasing patterns from 2000 to 2023. This research highlights the importance of continuous monitoring and advanced analysis in pharmacovigilance, offering valuable insights for healthcare professionals and policymakers to enhance drug safety.

Keywords: event analysis, FDA adverse event reporting system, pharmacovigilance, temporal trend analysis

Procedia PDF Downloads 25
40590 Customer Churn Analysis in Telecommunication Industry Using Data Mining Approach

Authors: Burcu Oralhan, Zeki Oralhan, Nilsun Sariyer, Kumru Uyar

Abstract:

Data mining has been becoming more and more important and a wide range of applications in recent years. Data mining is the process of find hidden and unknown patterns in big data. One of the applied fields of data mining is Customer Relationship Management. Understanding the relationships between products and customers is crucial for every business. Customer Relationship Management is an approach to focus on customer relationship development, retention and increase on customer satisfaction. In this study, we made an application of a data mining methods in telecommunication customer relationship management side. This study aims to determine the customers profile who likely to leave the system, develop marketing strategies, and customized campaigns for customers. Data are clustered by applying classification techniques for used to determine the churners. As a result of this study, we will obtain knowledge from international telecommunication industry. We will contribute to the understanding and development of this subject in Customer Relationship Management.

Keywords: customer churn analysis, customer relationship management, data mining, telecommunication industry

Procedia PDF Downloads 286
40589 Implementation of Achterbahn-128 for Images Encryption and Decryption

Authors: Aissa Belmeguenai, Khaled Mansouri

Abstract:

In this work, an efficient implementation of Achterbahn-128 for images encryption and decryption was introduced. The implementation for this simulated project is written by MATLAB.7.5. At first two different original images are used for validate the proposed design. Then our developed program was used to transform the original images data into image digits file. Finally, we used our implemented program to encrypt and decrypt images data. Several tests are done for proving the design performance including visual tests and security analysis; we discuss the security analysis of the proposed image encryption scheme including some important ones like key sensitivity analysis, key space analysis, and statistical attacks.

Keywords: Achterbahn-128, stream cipher, image encryption, security analysis

Procedia PDF Downloads 505
40588 Analysis and Rule Extraction of Coronary Artery Disease Data Using Data Mining

Authors: Rezaei Hachesu Peyman, Oliyaee Azadeh, Salahzadeh Zahra, Alizadeh Somayyeh, Safaei Naser

Abstract:

Coronary Artery Disease (CAD) is one major cause of disability in adults and one main cause of death in developed. In this study, data mining techniques including Decision Trees, Artificial neural networks (ANNs), and Support Vector Machine (SVM) analyze CAD data. Data of 4948 patients who had suffered from heart diseases were included in the analysis. CAD is the target variable, and 24 inputs or predictor variables are used for the classification. The performance of these techniques is compared in terms of sensitivity, specificity, and accuracy. The most significant factor influencing CAD is chest pain. Elderly males (age > 53) have a high probability to be diagnosed with CAD. SVM algorithm is the most useful way for evaluation and prediction of CAD patients as compared to non-CAD ones. Application of data mining techniques in analyzing coronary artery diseases is a good method for investigating the existing relationships between variables.

Keywords: classification, coronary artery disease, data-mining, knowledge discovery, extract

Procedia PDF Downloads 632
40587 An Exploratory Analysis of Brisbane's Commuter Travel Patterns Using Smart Card Data

Authors: Ming Wei

Abstract:

Over the past two decades, Location Based Service (LBS) data have been increasingly applied to urban and transportation studies due to their comprehensiveness and consistency. However, compared to other LBS data including mobile phone data, GPS and social networking platforms, smart card data collected from public transport users have arguably yet to be fully exploited in urban systems analysis. By using five weekdays of passenger travel transaction data taken from go card – Southeast Queensland’s transit smart card – this paper analyses the spatiotemporal distribution of passenger movement with regard to the land use patterns in Brisbane. Work and residential places for public transport commuters were identified after extracting journeys-to-work patterns. Our results show that the locations of the workplaces identified from the go card data and residential suburbs are largely consistent with those that were marked in the land use map. However, the intensity for some residential locations in terms of population or commuter densities do not match well between the map and those derived from the go card data. This indicates that the misalignment between residential areas and workplaces to a certain extent, shedding light on how enhancements to service management and infrastructure expansion might be undertaken.

Keywords: big data, smart card data, travel pattern, land use

Procedia PDF Downloads 263
40586 Impact of Foreign Trade on Economic Growth: A Panel Data Analysis for OECD Countries

Authors: Burcu Guvenek, Duygu Baysal Kurt

Abstract:

The impact of foreign trade on economic growth has been discussed since the Classical Economists. Today, foreign trade has become more important for the country's economy with the increasing globalization. When it comes to foreign trade, policies which may vary from country to country and from time to time as protectionism or free trade are implemented. In general, the positive effect of foreign trade on economic growth is alleged. However, as studies supporting this general acceptance take place in the economics literature, there are also studies in the opposite direction. In this paper, the impact of foreign trade on economic growth will be investigated with the help of panel data analysis. For this research, 24 OECD countries’ GDP and foreign trade data, including the period of 1990 and 2010, will be used.

Keywords: foreign trade, economic growth, OECD countries, panel data analysis

Procedia PDF Downloads 359
40585 Using Emerging Hot Spot Analysis to Analyze Overall Effectiveness of Policing Policy and Strategy in Chicago

Authors: Tyler Gill, Sophia Daniels

Abstract:

The paper examines how accessing the spatial-temporal constrains of data will help inform policymakers and law enforcement officials. The authors utilize Chicago crime data from 2006-2016 to demonstrate how the Emerging Hot Spot Tool is an ideal hot spot clustering approach to analyze crime data. Traditional approaches include density maps or creating a spatial weights matrix to include the spatial-temporal constrains. This new approach utilizes a space-time implementation of the Getis-Ord Gi* statistic to visualize the data more quickly to make better decisions. The research will help complement socio-cultural research to find key patterns to help frame future policies and evaluate the implementation of prior strategies. Through this analysis, homicide trends and patterns are found more effectively and recommendations for use by non-traditional users of GIS are offered for real life implementation.

Keywords: crime mapping, emerging hot spot analysis, Getis-Ord Gi*, spatial-temporal analysis

Procedia PDF Downloads 222
40584 Predicting Customer Purchasing Behaviour in Retail Marketing: A Research for a Supermarket Chain

Authors: Sabri Serkan Güllüoğlu

Abstract:

Analysis can be defined as the process of gathering, recording and researching data related to products and services, in order to learn something. But for marketers, analyses are not only used for learning but also an essential and critical part of the business, because this allows companies to offer products or services which are focused and well targeted. Market analysis also identify market trends, demographics, customer’s buying habits and important information on the competition. Data mining is used instead of traditional research, because it extracts predictive information about customer and sales from large databases. In contrast to traditional research, data mining relies on information that is already available. Simply the goal is to improve the efficiency of supermarkets. In this study, the purpose is to find dependency on products. For instance, which items are bought together, using association rules in data mining. Moreover, this information will be used for improving the profitability of customers such as increasing shopping time and sales of fewer sold items.

Keywords: data mining, association rule mining, market basket analysis, purchasing

Procedia PDF Downloads 459
40583 Post Pandemic Mobility Analysis through Indexing and Sharding in MongoDB: Performance Optimization and Insights

Authors: Karan Vishavjit, Aakash Lakra, Shafaq Khan

Abstract:

The COVID-19 pandemic has pushed healthcare professionals to use big data analytics as a vital tool for tracking and evaluating the effects of contagious viruses. To effectively analyze huge datasets, efficient NoSQL databases are needed. The analysis of post-COVID-19 health and well-being outcomes and the evaluation of the effectiveness of government efforts during the pandemic is made possible by this research’s integration of several datasets, which cuts down on query processing time and creates predictive visual artifacts. We recommend applying sharding and indexing technologies to improve query effectiveness and scalability as the dataset expands. Effective data retrieval and analysis are made possible by spreading the datasets into a sharded database and doing indexing on individual shards. Analysis of connections between governmental activities, poverty levels, and post-pandemic well being is the key goal. We want to evaluate the effectiveness of governmental initiatives to improve health and lower poverty levels. We will do this by utilising advanced data analysis and visualisations. The findings provide relevant data that supports the advancement of UN sustainable objectives, future pandemic preparation, and evidence-based decision-making. This study shows how Big Data and NoSQL databases may be used to address problems with global health.

Keywords: big data, COVID-19, health, indexing, NoSQL, sharding, scalability, well being

Procedia PDF Downloads 43
40582 Analyzing the Relationship between the Spatial Characteristics of Cultural Structure, Activities, and the Tourism Demand

Authors: Deniz Karagöz

Abstract:

This study is attempt to comprehend the relationship between the spatial characteristics of cultural structure, activities and the tourism demand in Turkey. The analysis divided into four parts. The first part consisted of a cultural structure and cultural activity (CSCA) index provided by principal component analysis. The analysis determined four distinct dimensions, namely, cultural activity/structure, accessing culture, consumption, and cultural management. The exploratory spatial data analysis employed to determine the spatial models of cultural structure and cultural activities in 81 provinces in Turkey. Global Moran I indices is used to ascertain the cultural activities and the structural clusters. Finally, the relationship between the cultural activities/cultural structure and tourism demand was analyzed. The raw/original data of the study official databases. The data on the cultural structure and activities gathered from the Turkish Statistical Institute and the data related to the tourism demand was provided by the Republic of Turkey Ministry of Culture and Tourism.

Keywords: cultural activities, cultural structure, spatial characteristics, tourism demand, Turkey

Procedia PDF Downloads 524
40581 A Survey of Semantic Integration Approaches in Bioinformatics

Authors: Chaimaa Messaoudi, Rachida Fissoune, Hassan Badir

Abstract:

Technological advances of computer science and data analysis are helping to provide continuously huge volumes of biological data, which are available on the web. Such advances involve and require powerful techniques for data integration to extract pertinent knowledge and information for a specific question. Biomedical exploration of these big data often requires the use of complex queries across multiple autonomous, heterogeneous and distributed data sources. Semantic integration is an active area of research in several disciplines, such as databases, information-integration, and ontology. We provide a survey of some approaches and techniques for integrating biological data, we focus on those developed in the ontology community.

Keywords: biological ontology, linked data, semantic data integration, semantic web

Procedia PDF Downloads 418
40580 Applying Hybrid Graph Drawing and Clustering Methods on Stock Investment Analysis

Authors: Mouataz Zreika, Maria Estela Varua

Abstract:

Stock investment decisions are often made based on current events of the global economy and the analysis of historical data. Conversely, visual representation could assist investors’ gain deeper understanding and better insight on stock market trends more efficiently. The trend analysis is based on long-term data collection. The study adopts a hybrid method that combines the Clustering algorithm and Force-directed algorithm to overcome the scalability problem when visualizing large data. This method exemplifies the potential relationships between each stock, as well as determining the degree of strength and connectivity, which will provide investors another understanding of the stock relationship for reference. Information derived from visualization will also help them make an informed decision. The results of the experiments show that the proposed method is able to produced visualized data aesthetically by providing clearer views for connectivity and edge weights.

Keywords: clustering, force-directed, graph drawing, stock investment analysis

Procedia PDF Downloads 279
40579 Diagrid Structural System

Authors: K. Raghu, Sree Harsha

Abstract:

The interrelationship between the technology and architecture of tall buildings is investigated from the emergence of tall buildings in late 19th century to the present. In the late 19th century early designs of tall buildings recognized the effectiveness of diagonal bracing members in resisting lateral forces. Most of the structural systems deployed for early tall buildings were steel frames with diagonal bracings of various configurations such as X, K, and eccentric. Though the historical research a filtering concept is developed original and remedial technology- through which one can clearly understand inter-relationship between the technical evolution and architectural esthetic and further stylistic transition buildings. Diagonalized grid structures – “diagrids” - have emerged as one of the most innovative and adaptable approaches to structuring buildings in this millennium. Variations of the diagrid system have evolved to the point of making its use non-exclusive to the tall building. Diagrid construction is also to be found in a range of innovative mid-rise steel projects. Contemporary design practice of tall buildings is reviewed and design guidelines are provided for new design trends. Investigated in depths are the behavioral characteristics and design methodology for diagrids structures, which emerge as a new direction in the design of tall buildings with their powerful structural rationale and symbolic architectural expression. Moreover, new technologies for tall building structures and facades are developed for performance enhancement through design integration, and their architectural potentials are explored. By considering the above data the analysis and design of 40-100 storey diagrids steel buildings is carried out using E-TABS software with diagrids of various angle to be found for entire building which will be helpful to reduce the steel requirement for the structure. The present project will have to undertake wind analysis, seismic analysis for lateral loads acting on the structure due to wind loads, earthquake loads, gravity loads. All structural members are designed as per IS 800-2007 considering all load combination. Comparison of results in terms of time period, top storey displacement and inter-storey drift to be carried out. The secondary effect like temperature variations are not considered in the design assuming small variation.

Keywords: diagrid, bracings, structural, building

Procedia PDF Downloads 354
40578 Integrating of Multi-Criteria Decision Making and Spatial Data Warehouse in Geographic Information System

Authors: Zohra Mekranfar, Ahmed Saidi, Abdellah Mebrek

Abstract:

This work aims to develop multi-criteria decision making (MCDM) and spatial data warehouse (SDW) methods, which will be integrated into a GIS according to a ‘GIS dominant’ approach. The GIS operating tools will be operational to operate the SDW. The MCDM methods can provide many solutions to a set of problems with various and multiple criteria. When the problem is so complex, integrating spatial dimension, it makes sense to combine the MCDM process with other approaches like data mining, ascending analyses, we present in this paper an experiment showing a geo-decisional methodology of SWD construction, On-line analytical processing (OLAP) technology which combines both basic multidimensional analysis and the concepts of data mining provides powerful tools to highlight inductions and information not obvious by traditional tools. However, these OLAP tools become more complex in the presence of the spatial dimension. The integration of OLAP with a GIS is the future geographic and spatial information solution. GIS offers advanced functions for the acquisition, storage, analysis, and display of geographic information. However, their effectiveness for complex spatial analysis is questionable due to their determinism and their decisional rigor. A prerequisite for the implementation of any analysis or exploration of spatial data requires the construction and structuring of a spatial data warehouse (SDW). This SDW must be easily usable by the GIS and by the tools offered by an OLAP system.

Keywords: data warehouse, GIS, MCDM, SOLAP

Procedia PDF Downloads 150
40577 Development of an Artificial Neural Network to Measure Science Literacy Leveraging Neuroscience

Authors: Amanda Kavner, Richard Lamb

Abstract:

Faster growth in science and technology of other nations may make staying globally competitive more difficult without shifting focus on how science is taught in US classes. An integral part of learning science involves visual and spatial thinking since complex, and real-world phenomena are often expressed in visual, symbolic, and concrete modes. The primary barrier to spatial thinking and visual literacy in Science, Technology, Engineering, and Math (STEM) fields is representational competence, which includes the ability to generate, transform, analyze and explain representations, as opposed to generic spatial ability. Although the relationship is known between the foundational visual literacy and the domain-specific science literacy, science literacy as a function of science learning is still not well understood. Moreover, the need for a more reliable measure is necessary to design resources which enhance the fundamental visuospatial cognitive processes behind scientific literacy. To support the improvement of students’ representational competence, first visualization skills necessary to process these science representations needed to be identified, which necessitates the development of an instrument to quantitatively measure visual literacy. With such a measure, schools, teachers, and curriculum designers can target the individual skills necessary to improve students’ visual literacy, thereby increasing science achievement. This project details the development of an artificial neural network capable of measuring science literacy using functional Near-Infrared Spectroscopy (fNIR) data. This data was previously collected by Project LENS standing for Leveraging Expertise in Neurotechnologies, a Science of Learning Collaborative Network (SL-CN) of scholars of STEM Education from three US universities (NSF award 1540888), utilizing mental rotation tasks, to assess student visual literacy. Hemodynamic response data from fNIRsoft was exported as an Excel file, with 80 of both 2D Wedge and Dash models (dash) and 3D Stick and Ball models (BL). Complexity data were in an Excel workbook separated by the participant (ID), containing information for both types of tasks. After changing strings to numbers for analysis, spreadsheets with measurement data and complexity data were uploaded to RapidMiner’s TurboPrep and merged. Using RapidMiner Studio, a Gradient Boosted Trees artificial neural network (ANN) consisting of 140 trees with a maximum depth of 7 branches was developed, and 99.7% of the ANN predictions are accurate. The ANN determined the biggest predictors to a successful mental rotation are the individual problem number, the response time and fNIR optode #16, located along the right prefrontal cortex important in processing visuospatial working memory and episodic memory retrieval; both vital for science literacy. With an unbiased measurement of science literacy provided by psychophysiological measurements with an ANN for analysis, educators and curriculum designers will be able to create targeted classroom resources to help improve student visuospatial literacy, therefore improving science literacy.

Keywords: artificial intelligence, artificial neural network, machine learning, science literacy, neuroscience

Procedia PDF Downloads 93
40576 From Data Processing to Experimental Design and Back Again: A Parameter Identification Problem Based on FRAP Images

Authors: Stepan Papacek, Jiri Jablonsky, Radek Kana, Ctirad Matonoha, Stefan Kindermann

Abstract:

FRAP (Fluorescence Recovery After Photobleaching) is a widely used measurement technique to determine the mobility of fluorescent molecules within living cells. While the experimental setup and protocol for FRAP experiments are usually fixed, data processing part is still under development. In this paper, we formulate and solve the problem of data selection which enhances the processing of FRAP images. We introduce the concept of the irrelevant data set, i.e., the data which are almost not reducing the confidence interval of the estimated parameters and thus could be neglected. Based on sensitivity analysis, we both solve the problem of the optimal data space selection and we find specific conditions for optimizing an important experimental design factor, e.g., the radius of bleach spot. Finally, a theorem announcing less precision of the integrated data approach compared to the full data case is proven; i.e., we claim that the data set represented by the FRAP recovery curve lead to a larger confidence interval compared to the spatio-temporal (full) data.

Keywords: FRAP, inverse problem, parameter identification, sensitivity analysis, optimal experimental design

Procedia PDF Downloads 254
40575 A Quantitative Evaluation of Text Feature Selection Methods

Authors: B. S. Harish, M. B. Revanasiddappa

Abstract:

Due to rapid growth of text documents in digital form, automated text classification has become an important research in the last two decades. The major challenge of text document representations are high dimension, sparsity, volume and semantics. Since the terms are only features that can be found in documents, selection of good terms (features) plays an very important role. In text classification, feature selection is a strategy that can be used to improve classification effectiveness, computational efficiency and accuracy. In this paper, we present a quantitative analysis of most widely used feature selection (FS) methods, viz. Term Frequency-Inverse Document Frequency (tfidf ), Mutual Information (MI), Information Gain (IG), CHISquare (x2), Term Frequency-Relevance Frequency (tfrf ), Term Strength (TS), Ambiguity Measure (AM) and Symbolic Feature Selection (SFS) to classify text documents. We evaluated all the feature selection methods on standard datasets like 20 Newsgroups, 4 University dataset and Reuters-21578.

Keywords: classifiers, feature selection, text classification

Procedia PDF Downloads 427
40574 A Highly Accurate Computer-Aided Diagnosis: CAD System for the Diagnosis of Breast Cancer by Using Thermographic Analysis

Authors: Mahdi Bazarganigilani

Abstract:

Computer-aided diagnosis (CAD) systems can play crucial roles in diagnosing crucial diseases such as breast cancer at the earliest. In this paper, a CAD system for the diagnosis of breast cancer was introduced and evaluated. This CAD system was developed by using spatio-temporal analysis of data on a set of consecutive thermographic images by employing wavelet transformation. By using this analysis, a very accurate machine learning model using random forest was obtained. The final results showed a promising accuracy of 91% in terms of the F1 measure indicator among 200 patients' sample data. The CAD system was further extended to obtain a detailed analysis of the effect of smaller sub-areas of each breast on the occurrence of cancer.

Keywords: computer-aided diagnosis systems, thermographic analysis, spatio-temporal analysis, image processing, machine learning

Procedia PDF Downloads 184
40573 Performance of the Cmip5 Models in Simulation of the Present and Future Precipitation over the Lake Victoria Basin

Authors: M. A. Wanzala, L. A. Ogallo, F. J. Opijah, J. N. Mutemi

Abstract:

The usefulness and limitations in climate information are due to uncertainty inherent in the climate system. For any given region to have sustainable development it is important to apply climate information into its socio-economic strategic plans. The overall objective of the study was to assess the performance of the Coupled Model Inter-comparison Project (CMIP5) over the Lake Victoria Basin. The datasets used included the observed point station data, gridded rainfall data from Climate Research Unit (CRU) and hindcast data from eight CMIP5. The methodology included trend analysis, spatial analysis, correlation analysis, Principal Component Analysis (PCA) regression analysis, and categorical statistical skill score. Analysis of the trends in the observed rainfall records indicated an increase in rainfall variability both in space and time for all the seasons. The spatial patterns of the individual models output from the models of MPI, MIROC, EC-EARTH and CNRM were closest to the observed rainfall patterns.

Keywords: categorical statistics, coupled model inter-comparison project, principal component analysis, statistical downscaling

Procedia PDF Downloads 344
40572 Road Safety in the Great Britain: An Exploratory Data Analysis

Authors: Jatin Kumar Choudhary, Naren Rayala, Abbas Eslami Kiasari, Fahimeh Jafari

Abstract:

The Great Britain has one of the safest road networks in the world. However, the consequences of any death or serious injury are devastating for loved ones, as well as for those who help the severely injured. This paper aims to analyse the Great Britain's road safety situation and show the response measures for areas where the total damage caused by accidents can be significantly and quickly reduced. In this paper, we do an exploratory data analysis using STATS19 data. For the past 30 years, the UK has had a good record in reducing fatalities. The UK ranked third based on the number of road deaths per million inhabitants. There were around 165,000 accidents reported in the Great Britain in 2009 and it has been decreasing every year until 2019 which is under 120,000. The government continues to scale back road deaths empowering responsible road users by identifying and prosecuting the parameters that make the roads less safe.

Keywords: road safety, data analysis, openstreetmap, feature expanding.

Procedia PDF Downloads 107
40571 Qualitative Data Analysis for Health Care Services

Authors: Taner Ersoz, Filiz Ersoz

Abstract:

This study was designed enable application of multivariate technique in the interpretation of categorical data for measuring health care services satisfaction in Turkey. The data was collected from a total of 17726 respondents. The establishment of the sample group and collection of the data were carried out by a joint team from The Ministry of Health and Turkish Statistical Institute (Turk Stat) of Turkey. The multiple correspondence analysis (MCA) was used on the data of 2882 respondents who answered the questionnaire in full. The multiple correspondence analysis indicated that, in the evaluation of health services females, public employees, younger and more highly educated individuals were more concerned and complainant than males, private sector employees, older and less educated individuals. Overall 53 % of the respondents were pleased with the improvements in health care services in the past three years. This study demonstrates the public consciousness in health services and health care satisfaction in Turkey. It was found that most the respondents were pleased with the improvements in health care services over the past three years. Awareness of health service quality increases with education levels. Older individuals and males would appear to have lower expectancies in health services.

Keywords: multiple correspondence analysis, multivariate categorical data, health care services, health satisfaction survey

Procedia PDF Downloads 207