Search results for: R data science

24220 Mining Scientific Literature to Discover Potential Research Data Sources: An Exploratory Study in the Field of Haemato-Oncology

Authors: A. Anastasiou, K. S. Tingay

Abstract:

Background: Discovering suitable datasets is an important part of health research, particularly for projects working with clinical data from patients organized in cohorts (cohort data). With the proliferation of so many national and international initiatives, however, it is becoming increasingly difficult for research teams to locate the real-world datasets most relevant to their project objectives. We present a method for identifying healthcare institutes in the European Union (EU) which may hold haemato-oncology (HO) data. A key enabler of this research was the bibInsight platform, a scientometric data management and analysis system developed by the authors at Swansea University. Method: A PubMed search was conducted using HO clinical terms taken from previous work. The resulting XML file was processed using the bibInsight platform, linking affiliations to the Global Research Identifier Database (GRID). GRID is an international, standardized list of institutions that records the city and country in which each institution exists, as well as a category for its main business type, e.g., Academic, Healthcare, Government, Company. Countries were limited to the 28 current EU members, and institute type to 'Healthcare'. An article was considered valid if at least one author was affiliated with an EU-based healthcare institute. Results: The PubMed search produced 21,310 articles, comprising 9,885 distinct affiliations with correspondence in GRID. Of these affiliations, 760 were from EU countries, and 390 of these were healthcare institutes. One affiliation was excluded as being a veterinary hospital. Two EU countries did not have any publications in our analysis dataset. The results were analysed by country and by individual healthcare institute. Collaboration networks, both within the EU and internationally, may suggest a willingness to share data for research purposes. Geographical mapping can ensure that data has broad population coverage. Collaborations with industry or government may exclude healthcare institutes that have embargos or additional costs associated with data access. Conclusions: Data reuse is becoming increasingly important both for ensuring the validity of results and for economy of available resources. The ability to identify potential, specific data sources from over twenty thousand articles in less than an hour could improve knowledge of, and access to, data sources. As our method does not yet establish whether these healthcare institutes hold data or merely publish on the topic, future work will involve text mining of data-specific concordant terms to identify numbers of participants, demographics, study methodologies, and sub-topics of interest.
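
A minimal sketch of the kind of affiliation-extraction step described here, assuming a PubMed XML export on disk; the file name is a placeholder and the GRID matching stage is not shown (bibInsight itself is not public).

```python
# Hypothetical sketch: parse a PubMed XML export and collect the distinct
# author affiliation strings that would later be matched against GRID.
import xml.etree.ElementTree as ET

tree = ET.parse("pubmed_result.xml")   # placeholder: XML export of the search
affiliations = set()

for article in tree.getroot().iter("PubmedArticle"):
    for aff in article.iter("Affiliation"):
        if aff.text:
            affiliations.add(aff.text.strip())

print(f"{len(affiliations)} distinct affiliation strings to match against GRID")
```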

Keywords: data reuse, data discovery, data linkage, journal articles, text mining

24219 Using Data Mining Technique for Scholarship Disbursement

Authors: J. K. Alhassan, S. A. Lawal

Abstract:

This work is on decision tree-based classification for the disbursement of scholarships. A tree-based data mining classification technique is used to determine the generic rules for disbursing the scholarship. Based on the rules derived from the tree, the system is able to determine the class (status) to which an applicant belongs: Granted or Not Granted. Applicants that fall into the granted class have successfully acquired the scholarship, while those in the not granted class are unsuccessful in the scheme. An algorithm that classifies applicants based on the rules from the tree-based classification was also developed. Tree-based classification is adopted because of its efficiency, effectiveness, and ease of comprehension. The system was tested with data from the National Information Technology Development Agency (NITDA) Abuja, a parastatal of the Federal Ministry of Communication Technology mandated to develop and regulate information technology in Nigeria. The system was found to work according to specification. It is therefore recommended for all scholarship disbursement organizations.
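
As a rough illustration of this approach, the sketch below trains a small decision tree with scikit-learn; the applicant features, sample records and thresholds are invented for illustration and are not the NITDA data.

```python
# Minimal decision-tree sketch for Granted / Not Granted classification.
# Features and records are illustrative placeholders, not the NITDA data.
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical applicant features: [exam_score, household_income, dependants]
X = [[72, 15000, 4], [55, 48000, 1], [88, 9000, 5], [61, 30000, 2]]
y = ["Granted", "Not Granted", "Granted", "Not Granted"]

clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# The learned tree doubles as the "generic rule" used for disbursement
print(export_text(clf, feature_names=["exam_score", "income", "dependants"]))
print(clf.predict([[70, 12000, 3]]))   # classify a new applicant
```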

Keywords: classification, data mining, decision tree, scholarship

24218 [Keynote Speech]: Feature Selection and Predictive Modeling of Housing Data Using Random Forest

Authors: Bharatendra Rai

Abstract:

Predictive data analysis and modeling involving machine learning techniques become challenging in the presence of too many explanatory variables or features. Too many features are known not only to slow algorithms down, but also to decrease model prediction accuracy. This study involves a housing dataset with 79 quantitative and qualitative features that describe various aspects people consider while buying a new house. The Boruta algorithm, which supports feature selection using a wrapper approach built around random forest, is used in this study. This feature selection process leads to 49 confirmed features, which are then used for developing predictive random forest models. The study also explores five different data partitioning ratios, and their impact on model accuracy is captured using the coefficient of determination (r-square) and root mean square error (RMSE).
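
A hedged sketch of the pipeline described here, using the third-party boruta package (BorutaPy) with scikit-learn; the housing data is replaced by random placeholders, so the confirmed-feature count will not match the study's 49.

```python
# Boruta wrapper selection around a random forest, then RF prediction
# scored with r-square and RMSE. Data is a synthetic stand-in.
import numpy as np
from boruta import BorutaPy
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score, mean_squared_error

rng = np.random.default_rng(1)
X = rng.random((200, 79))                             # 79 placeholder features
y = X[:, 0] + 2 * X[:, 1] + rng.normal(0, 0.1, 200)   # placeholder sale price

rf = RandomForestRegressor(n_jobs=-1, random_state=1)
boruta = BorutaPy(rf, n_estimators="auto", random_state=1)
boruta.fit(X, y)                        # shadow-feature wrapper selection
X_sel = X[:, boruta.support_]           # keep only confirmed features

# One of the partitioning ratios explored in the study (80:20 shown here)
X_tr, X_te, y_tr, y_te = train_test_split(X_sel, y, test_size=0.2,
                                          random_state=1)
model = RandomForestRegressor(random_state=1).fit(X_tr, y_tr)
pred = model.predict(X_te)
print(r2_score(y_te, pred), np.sqrt(mean_squared_error(y_te, pred)))
```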

Keywords: housing data, feature selection, random forest, Boruta algorithm, root mean square error

24217 The Need for Higher Education STEM Integrated into the Social Sciences

Authors: Luis Fernando Calvo Prieto, Raul Herrero Martínez, Mónica Santamarta Llorente, Sergio Paniagua Bermejo

Abstract:

The project presented here starts from a questioning of the compartmentalization of knowledge that occurs in university higher education. Several authors have described the problems associated with this reality (Rodamillans, M.), indicating a lack of integration of the knowledge acquired by students across the subjects taken in their university degree. Furthermore, this disintegration is accentuated by the enrollment system of some faculties and schools of engineering, which allows students to take subjects outside the recommended curricular path. The problem becomes glaring when one tries to find any integration between humanistic subjects and the world of experimental sciences or engineering. This abrupt separation between the humanities and the sciences can be observed in any study plan of Spanish degrees: except for subjects such as economics or English, the absence of humanistic content in the Faculties of Sciences and Schools of Engineering is striking. At some point it was decided that the only value to take into account when designing their study plans was “usefulness”, systematically considering the humanities useless for training and therefore banishing them from the study plans, forgetting the role they play in the capacity for both leadership and civic humanism in our professionals of tomorrow. The teaching guides for the subjects in the branches of science and engineering do not include any competency, not even a transversal one, related to leadership capacity, or to the need, in today's world, for social, civic and humanitarian knowledge on the part of the people who will offer medical, pharmaceutical, environmental, biotechnological or engineering solutions to a society built on more or less complex human relationships and on the historical events that have occurred so far. If we want professionals who deal effectively and rationally with their leadership tasks and who, in addition, develop an ethically civic sense and a humanistic profile in their functions and scientific tasks, we must not set aside the importance of their knowing the causes, facts and consequences of key events in the history of humanity. The words of the humanist Paul Preston are well known: “he who does not know his history is condemned to repeat the mistakes of the past.” The idea that today there can be men of science in the way the scientists of the Renaissance were is, at the very least, difficult to conceive; to think that a Leonardo da Vinci can be repeated in current times is close to absurd. Although at first the specialization of a professional may seem inevitable yet beneficial, some authors consider (Sánchez Inarejos) that it has an extremely serious negative side effect: entrenchment behind the postulates of each area of knowledge, disdaining everything foreign to it.

Keywords: STEM, higher education, social sciences, history

24216 Image-Based (RGB) Technique for Estimating Phosphorus Levels of Different Crops

Authors: M. M. Ali, Ahmed Al- Ani, Derek Eamus, Daniel K. Y. Tan

Abstract:

In this glasshouse study, we developed a new image-based, non-destructive technique for detecting the leaf P status of different crops such as cotton, tomato and lettuce. Plants were grown on nutrient media containing different P concentrations, i.e., 0%, 50% and 100% of the recommended P concentration (P0 = no P; P1 = 2.5 mL 10 L-1 of P; and P2 = 5 mL 10 L-1 of P as NaH2PO4). After 10 weeks of growth, plants were harvested and leaf P contents were measured using the standard destructive laboratory method, while leaf images were collected with a handheld crop image sensor. We calculated leaf area, leaf perimeter and RGB (red, green and blue) values from these images. These data were then used in a linear discriminant analysis (LDA) to estimate leaf P contents, which successfully classified the plants on the basis of leaf P contents. The data indicate that P deficiency in crop plants can be predicted using image and morphological data. Our proposed non-destructive imaging method is precise in estimating the P requirements of different crop species.
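
A hedged sketch of the feature-extraction and LDA steps, with synthesized arrays standing in for leaf photographs; real input would come from the handheld sensor, and leaf area and perimeter would be added as extra feature columns.

```python
# Mean RGB values per leaf feed a linear discriminant classifier over
# P-level classes. The "images" here are synthetic placeholders.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)

def fake_leaf(green):
    """Synthetic leaf image: greener pixels stand in for higher P status."""
    img = rng.integers(0, 40, (64, 64, 3))
    img[..., 1] += green                  # boost the green channel
    return img

# Three plants per P level; mean R, G, B as the feature vector
levels = {"P0": 60, "P1": 120, "P2": 180}
X, y = [], []
for label, green in levels.items():
    for _ in range(3):
        px = fake_leaf(green).reshape(-1, 3)
        X.append(px.mean(axis=0))         # leaf area/perimeter would be
        y.append(label)                   # appended here in the study

lda = LinearDiscriminantAnalysis().fit(X, y)
print(lda.predict(X))                     # classify leaves by P status
```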

Keywords: image-based techniques, leaf area, leaf P contents, linear discriminant analysis

24215 Design of Visual Repository, Constraint and Process Modeling Tool Based on Eclipse Plug-Ins

Authors: Rushiraj Heshi, Smriti Bhandari

Abstract:

Master Data Management requires the creation of a central repository, the application of constraints on the repository, and the design of processes to manage data. Designing repositories, repository constraints and business processes is a tedious and time-consuming task for a large enterprise. Hence, visual repository, constraint and process (workflow) modeling is the most critical step in Master Data Management. In this paper, we realize a visual modeling tool for implementing repositories, constraints and processes as an Eclipse plug-in using GMF/EMF, following the principles of Model Driven Engineering (MDE).

Keywords: EMF, GMF, GEF, repository, constraint, process

24214 Design and Development of Data Visualization in 2D and 3D Space Using Front-End Technologies

Authors: Sourabh Yaduvanshi, Varsha Namdeo, Namrata Yaduvanshi

Abstract:

This study examines the design and development of detailed 2D bar charts in d3.js, recognizing d3.js's limitations in generating 3D visuals within the DOM. It combines three.js with d3.js to enable a smooth progression from 2D charts to immersive 3D representations, a fusion that illustrates the synergy between front-end technologies and expands the horizons of data visualization. The abstract outlines the methodology behind this integration and offers guidance for practitioners, showing how 2D constraints can be transcended to carry data visualization into three-dimensional space.

Keywords: design, development, front-end technologies, visualization

24213 The Classification Performance in Parametric and Nonparametric Discriminant Analysis for a Class-Unbalanced Data of Diabetes Risk Groups

Authors: Lily Ingsrisawang, Tasanee Nacharoen

Abstract:

Introduction: Problems of unbalanced data sets generally appear in real-world applications. Due to unequal class distribution, many research papers have found that the performance of existing classifiers tends to be biased towards the majority class. The k-nearest neighbors nonparametric discriminant analysis is one method that has been proposed for classifying unbalanced classes with good performance. Hence, methods of discriminant analysis are of interest in investigating misclassification error rates for class-imbalanced data of three diabetes risk groups. Objective: The purpose of this study was to compare the classification performance of parametric and nonparametric discriminant analysis in a three-class classification application on class-imbalanced data of diabetes risk groups. Methods: Data on 599 staff from a healthy project in a government hospital in Bangkok were obtained for the classification problem. The staff were diagnosed into one of three diabetes risk groups: non-risk (90%), risk (5%), and diabetic (5%). The original data, with the variables diabetes risk group, age, gender, cholesterol, and BMI, were analyzed and bootstrapped up to 50 and 100 samples, 599 observations per sample, for additional estimation of the misclassification error rate. Each data set was explored for departure from multivariate normality and for equality of the covariance matrices of the three risk groups. Both the original data and the bootstrap samples show non-normality and unequal covariance matrices. The parametric linear discriminant function, the quadratic discriminant function, and the nonparametric k-nearest neighbors discriminant function were fitted over the 50 and 100 bootstrap samples and applied to the original data. In finding the optimal classification rule, prior probabilities were set both to equal proportions (0.33:0.33:0.33) and to unequal proportions of (0.90:0.05:0.05), (0.80:0.10:0.10) or (0.70:0.15:0.15). Results: The results from the 50 and 100 bootstrap samples indicated that the k-nearest neighbors approach with k = 3 or k = 4 and prior probabilities of {non-risk:risk:diabetic} as {0.90:0.05:0.05} or {0.80:0.10:0.10} gave the smallest misclassification error rate. Conclusion: The k-nearest neighbors approach is suggested for classifying three-class-imbalanced data of diabetes risk groups.
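
The sketch below reproduces the comparison in miniature on synthetic imbalanced data: scikit-learn's LDA and QDA accept the prior probabilities discussed here, while its k-nearest neighbors classifier has no prior-probability option, a difference from the study's kNN discriminant procedure.

```python
import numpy as np
from sklearn.discriminant_analysis import (LinearDiscriminantAnalysis,
                                           QuadraticDiscriminantAnalysis)
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
# Synthetic stand-in for the {non-risk: 90%, risk: 5%, diabetic: 5%} split
X = np.vstack([rng.normal(m, 1.0, (n, 4))
               for m, n in [(0.0, 540), (1.5, 30), (3.0, 29)]])
y = np.repeat(["non-risk", "risk", "diabetic"], [540, 30, 29])

# scikit-learn orders priors by sorted class label: diabetic, non-risk, risk
priors = [0.05, 0.90, 0.05]
models = {
    "LDA": LinearDiscriminantAnalysis(priors=priors),
    "QDA": QuadraticDiscriminantAnalysis(priors=priors),
    "3-NN": KNeighborsClassifier(n_neighbors=3),  # no priors option
}
for name, model in models.items():
    err = 1 - model.fit(X, y).score(X, y)   # apparent misclassification rate
    print(f"{name}: error rate {err:.3f}")
```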

Keywords: error rate, bootstrap, diabetes risk groups, k-nearest neighbors

24212 BFDD-S: Big Data Framework to Detect and Mitigate DDoS Attack in SDN Network

Authors: Amirreza Fazely Hamedani, Muzzamil Aziz, Philipp Wieder, Ramin Yahyapour

Abstract:

Software-defined networking has in recent years come into the sight of many network designers as a successor to traditional networking. Unlike traditional networks, where the control and data planes engage together within a single device in the network infrastructure, such as switches and routers, the two planes are kept separate in software-defined networks (SDNs). All critical decisions about packet routing are made on the network controller, and the data-plane devices forward packets based on these decisions. This type of network is vulnerable to DDoS attacks, which degrade the overall functioning and performance of the network by continuously injecting fake flows into it. This places a substantial burden on the controller side and ultimately leads to inaccessibility of the controller and a lack of network service for legitimate users. Thus, protecting this novel network architecture against denial-of-service attacks is essential. In the world of cybersecurity, attacks and new threats emerge every day, so it is essential to have tools capable of managing and analyzing all this new information to detect possible attacks in real time. These tools should provide a comprehensive solution to automatically detect, predict and prevent abnormalities in the network. Big data encompasses a wide range of studies, but it mainly refers to the massive amounts of structured and unstructured data that organizations deal with on a regular basis. It concerns not only the volume of the data, but also how data-driven information can be used to enhance decision-making processes, security, and the overall efficiency of a business. This paper presents an intelligent big data framework as a solution to handle the illegitimate traffic burden placed on the SDN network by numerous DDoS attacks. The framework entails an efficient defence and monitoring mechanism against DDoS attacks, employing state-of-the-art machine learning techniques.
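
As a hedged illustration of the machine-learning stage only (the paper's Spark/Kafka pipeline is not reproduced), the sketch below trains a classifier on invented per-flow features; the feature set and distributions are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
# Hypothetical per-flow features: [packets/s, bytes/s, duration (s),
# distinct destination ports]; attack flows skew high-rate and short-lived
benign = rng.normal([50, 4e4, 30, 3], [20, 1e4, 10, 2], (500, 4))
attack = rng.normal([900, 6e5, 2, 40], [300, 2e5, 1, 15], (500, 4))
X = np.vstack([benign, attack])
y = np.array([0] * 500 + [1] * 500)          # 0 = benign, 1 = DDoS

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)
clf = RandomForestClassifier(random_state=1).fit(X_tr, y_tr)
print("detection accuracy:", clf.score(X_te, y_te))
```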

Keywords: apache spark, apache kafka, big data, DDoS attack, machine learning, SDN network

24211 Demographic Dividend and Creation of Human and Knowledge Capital in Liberal India: An Endogenous Growth Process

Authors: Arjun K., Arumugam Sankaran, Sanjay Kumar, Mousumi Das

Abstract:

The paper analyses the existence of an endogenous growth scenario emanating from the demographic dividend in India during the liberalization period starting in 1980. The demographic dividend creates fertile ground for the cultivation of human and knowledge capital, contributing to technological progress, which can be measured using total factor productivity. The relationships among total factor productivity and human and knowledge capital are examined in an open endogenous framework for the period 1980-2016. Control variables such as foreign direct investment, trade openness, and energy consumption are also employed. The data are sourced from the Reserve Bank of India, the World Bank, the International Energy Agency and the National Science and Technology Management Information System. To understand the dynamic association among the variables, the ARDL bounds approach to cointegration, followed by the Toda-Yamamoto causality test, is used. The results reveal short-run and long-run relationships among the variables, supported by the existence of causality. This calls for an integrated policy to build and augment human capital and research and development activities to sustain and pace up growth and development in the nation.
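
As a simplified stand-in for the causality step, the sketch below runs statsmodels' standard Granger test on synthetic series; the Toda-Yamamoto procedure itself additionally augments the underlying VAR with extra lags and restricts the Wald test to the original lag order.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import grangercausalitytests

rng = np.random.default_rng(0)
n = 37  # annual observations, 1980-2016
human_capital = np.cumsum(rng.normal(1.0, 0.3, n))   # placeholder series
tfp = 0.5 * np.roll(human_capital, 1) + rng.normal(0, 0.2, n)  # lagged link

# Column order is [effect, cause]: tests whether human capital
# Granger-causes total factor productivity
data = pd.DataFrame({"tfp": tfp, "hc": human_capital}).iloc[1:]
grangercausalitytests(data, maxlag=2)
```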

Keywords: demographic dividend, young population, open endogenous growth models, human and knowledge capital

24210 Welding Process Selection for Storage Tank by Integrated Data Envelopment Analysis and Fuzzy Credibility Constrained Programming Approach

Authors: Rahmad Wisnu Wardana, Eakachai Warinsiriruk, Sutep Joy-A-Ka

Abstract:

Selecting the most suitable welding process usually depends on experience or on common practice in similar companies. However, this approach generally ignores many criteria that can affect the selection of a suitable welding process. Knowledge automation through knowledge-based systems can therefore significantly improve the decision-making process. This research proposes an integrated data envelopment analysis (DEA) and fuzzy credibility constrained programming approach for identifying the best welding process for stainless steel storage tanks in the food and beverage industry. The proposed approach uses the fuzzy concept and a credibility measure to deal with uncertain data from experts' judgment. Twelve parameters are used to determine the most appropriate welding process among six competing welding processes.
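
The sketch below shows the DEA building block only, an input-oriented CCR efficiency score computed by linear programming with SciPy; the paper's fuzzy credibility layer is omitted, and the welding-process inputs and outputs are invented numbers.

```python
import numpy as np
from scipy.optimize import linprog

# Rows = 6 candidate welding processes; example inputs: [cost, time]
X = np.array([[4, 6], [5, 4], [6, 5], [3, 8], [7, 3], [5, 5]], float)
# Example outputs: [weld quality score, deposition rate]
Y = np.array([[8, 5], [7, 6], [9, 4], [6, 6], [8, 7], [7, 5]], float)
n = len(X)

def ccr_efficiency(o):
    """Efficiency of unit o: minimize theta over (theta, lambda_1..n)."""
    c = np.r_[1.0, np.zeros(n)]                  # objective: theta
    A_in = np.c_[-X[o], X.T]                     # sum(l*x) <= theta * x_o
    A_out = np.c_[np.zeros(Y.shape[1]), -Y.T]    # sum(l*y) >= y_o
    A = np.vstack([A_in, A_out])
    b = np.r_[np.zeros(X.shape[1]), -Y[o]]
    res = linprog(c, A_ub=A, b_ub=b, bounds=[(0, None)] * (n + 1))
    return res.fun                               # theta = efficiency score

for o in range(n):
    print(f"process {o + 1}: efficiency {ccr_efficiency(o):.3f}")
```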

Keywords: welding process selection, data envelopment analysis, fuzzy credibility constrained programming, storage tank

24209 Reflection of Development of Production Relations in Museums: Case of Gobustan Museum

Authors: Fikrat Abdullayev, Narmin Huseynli

Abstract:

Archaeology is a science that studies the life and household of ancient people on the basis of samples of material culture. The key research objects of this science are the artefacts acquired during archaeological excavations, which can be seen in museums. Museums are the main institutions that convey impressions of the daily life and household of people in ancient times. Therefore, the systematization, exhibition and presentation of archaeological items in museums should be adapted so as to trace the development of productive forces and its reflection in the household life of people. In the Gobustan museum, commissioned in 2011, visitors can learn about the life and household, as well as the religious beliefs, of people at all stages of history, from the end of the Upper Palaeolithic to the Middle Ages, through archaeological items, rock carvings and modern technologies. The main idea of the museum exposition is to give visitors an idea of the environment, society and production relations during the Stone and Metal Ages. The influence of environmental factors and natural forces on the development of production factors and production relations can easily be seen in the exhibits of the Gobustan Museum. At the same time, the creation of new ideological attributes in a changing society, and the process of people assuming a dominant position in the belief system, can be seen in the substitution of motifs of rock carvings in their chronological context. The historical and cultural essence of the rock carvings in the Gobustan Museum is demonstrated through modern technological means and traditional museum concepts. In addition, the Gobustan Preserve is one of the rare places where visitors can come into direct contact with rock carvings.

Keywords: Gobustan, rock art, museum, productive forces

24208 On the Estimation of Crime Rate in the Southwest of Nigeria: Principal Component Analysis Approach

Authors: Kayode Balogun, Femi Ayoola

Abstract:

Crime is at an alarming rate in this part of the world, and many factors contribute to this antisocial behaviour among both the young and the old. In this work, principal component analysis (PCA) was used as a tool to reduce dimensionality and to identify the variables that were crime-prone in the study region, while retaining as much of the information as possible. Data were collected on twenty-eight crime variables from the National Bureau of Statistics (NBS) databank for a period of fifteen years. We use PCA in this study to determine the number of major variables and contributors to crime in Southwest Nigeria. The results of our analysis revealed that eight principal components were retained using the scree plot and loading plot, which implies that an eight-equation solution is appropriate for the data. The eight components explained 93.81% of the total variation in the data set. We also found that the most frequently committed crimes in Southwestern Nigeria were assault, grievous harm and wounding, theft/stealing, burglary, house breaking, false pretence, unlawful arms possession and breach of public peace.
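
A minimal sketch of the PCA step with scikit-learn, using a random placeholder matrix in place of the NBS crime data; component retention by cumulative explained variance mirrors the scree-plot decision.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
crime = rng.random((15, 28))            # placeholder: years x crime types

Z = StandardScaler().fit_transform(crime)
pca = PCA().fit(Z)

cumvar = np.cumsum(pca.explained_variance_ratio_)
k = int(np.searchsorted(cumvar, 0.9381)) + 1   # components for ~93.81%
print(f"{k} components explain {cumvar[k - 1]:.2%} of total variation")

# Loadings: weight of each crime variable on each retained component
loadings = pca.components_[:k].T * np.sqrt(pca.explained_variance_[:k])
print(loadings.shape)                   # (28 variables, k components)
```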

Keywords: crime rates, data, Southwest Nigeria, principal component analysis, variables

24207 On-Line Data-Driven Multivariate Statistical Prediction Approach to Production Monitoring

Authors: Hyun-Woo Cho

Abstract:

Detection of incipient abnormal events in production processes is important to improve the safety and reliability of manufacturing operations and to reduce the losses caused by failures. The construction of calibration models for predicting faulty conditions is essential in deciding when to perform preventive maintenance. This paper presents a multivariate calibration monitoring approach based on the statistical analysis of process measurement data, in which the calibration model is used to predict faulty conditions from historical reference data. The approach utilizes variable selection techniques, and the predictive performance of several prediction methods is evaluated using real data. The results show that a calibration model based on a supervised probabilistic model yielded the best performance in this work. By adopting a proper variable selection scheme in calibration models, prediction performance can be improved by excluding non-informative variables from the model building steps.
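
A hedged sketch of the general recipe: filter-based variable selection followed by a supervised probabilistic model, with a Gaussian process chosen here as one plausible example; the paper does not name its exact model, and the data below is synthetic.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(2)
X = rng.normal(size=(120, 20))          # 20 process measurements
y = 2 * X[:, 0] + X[:, 3] - X[:, 7] + rng.normal(0, 0.1, 120)  # fault index

sel = SelectKBest(f_regression, k=5).fit(X, y)   # drop non-informative vars
X_sel = sel.transform(X)

gp = GaussianProcessRegressor().fit(X_sel[:100], y[:100])
pred, std = gp.predict(X_sel[100:], return_std=True)  # mean + uncertainty
print(np.mean(np.abs(pred - y[100:])))  # mean absolute prediction error
```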

Keywords: calibration model, monitoring, quality improvement, feature selection

24206 Interactive IoT-Blockchain System for Big Data Processing

Authors: Abdallah Al-ZoubI, Mamoun Dmour

Abstract:

The spectrum of IoT devices is becoming widely diversified, entering almost all possible fields and finding applications in industry, health, finance, logistics and education, to name a few. Active IoT endpoint sensors and devices exceeded the 12 billion mark in 2021 and are expected to reach 27 billion in 2025, with over $34 billion in total market value. This sheer rise in the number and use of IoT devices brings considerable concerns regarding data storage, analysis, manipulation and protection. IoT blockchain-based systems have recently been proposed as a decentralized solution for large-scale data storage and protection. COVID-19 has actually accelerated the desire to utilize IoT devices, as it impacted both demand and supply and significantly affected several regions for logistic reasons such as supply chain interruptions, shortage of shipping containers and port congestion. An IoT-blockchain system is proposed to handle, in an interactive manner, the big data generated by a distributed network of sensors and controllers. The system is designed on the Ethereum platform, using smart contracts programmed in Solidity to execute and manage data generated by IoT sensors and devices such as the Raspberry Pi 4 running Raspbian with add-on hardware security modules. The proposed system runs a number of applications hosted by a local machine used to validate transactions, and then sends data to the rest of the network through the InterPlanetary File System (IPFS) and Ethereum Swarm, forming a closed IoT ecosystem run by blockchain in which a number of distributed IoT devices can communicate and interact, thus forming a closed, controlled environment. A prototype has been deployed with three IoT handling units distributed over a wide geographical space in order to examine its feasibility, performance and costs. Initial results indicated that big IoT data retrieval and storage is feasible and interactivity is possible, provided that certain conditions of cost, speed and throughput are met.
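
A heavily hedged sketch of how an IoT gateway might record a sensor reading on Ethereum with web3.py; the contract address, ABI, function name recordReading and node URL are all illustrative assumptions, as the paper's contracts are not public.

```python
from web3 import Web3

w3 = Web3(Web3.HTTPProvider("http://localhost:8545"))  # local validator node

contract = w3.eth.contract(
    address="0x0000000000000000000000000000000000000000",  # placeholder
    abi=[{"name": "recordReading", "type": "function",      # hypothetical
          "inputs": [{"name": "value", "type": "uint256"}],
          "outputs": [], "stateMutability": "nonpayable"}],
)

# Anchor a (hypothetical) sensor value on-chain and wait for the receipt
tx = contract.functions.recordReading(42).transact(
    {"from": w3.eth.accounts[0]})
print(w3.eth.wait_for_transaction_receipt(tx))
```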

Keywords: IoT devices, blockchain, Ethereum, big data

24205 Keynote Talk: The Role of Internet of Things in the Smart Cities Power System

Authors: Abdul-Rahman Al-Ali

Abstract:

As the number of mobile devices grows exponentially, it is estimated that about 50 billion devices will be connected to the Internet by the year 2020; at the end of this decade, an average of eight connected devices per person worldwide is expected. These 50 billion devices are not just mobile phones and data-browsing gadgets, but machine-to-machine and man-to-machine devices. With such growing numbers of devices, the Internet of Things (IoT) concept has emerged as one of the key recent technologies. Within smart grid technologies, smart home appliances, Intelligent Electronic Devices (IED) and Distributed Energy Resources (DER) are major IoT objects that can be addressed using IPv6. These objects are called the smart grid Internet of Things (SG-IoT). The SG-IoT generates big data that requires high-speed computing infrastructure, widespread computer networks, big data storage, software, and platform services. A utility company's control and data centers cannot handle such a large number of devices, high-speed processing, and massive data storage. Building a large data center's infrastructure takes a long time and requires widespread communication networks and huge capital investment. Maintaining and upgrading control and data centers' infrastructure and communication networks, as well as updating and renewing software licenses, collectively requires additional cost. This can be overcome by utilizing emerging computing paradigms such as cloud computing, which can serve as a smart grid enabler to replace the legacy of utilities' data centers. The talk will highlight the role of IoT and cloud computing services, and their development models, within smart grid technologies.

Keywords: intelligent electronic devices (IED), distributed energy resources (DER), internet, smart home appliances

24204 Statistical Analysis of Interferon-γ for the Effectiveness of an Anti-Tuberculous Treatment

Authors: Shishen Xie, Yingda L. Xie

Abstract:

Tuberculosis (TB) is a potentially serious infectious disease that remains a health concern. The Interferon Gamma Release Assay (IGRA) is a blood test to find out whether an individual is tuberculosis positive or negative. This study applies statistical analysis to the clinical data on interferon-gamma levels of seventy-three subjects diagnosed with pulmonary TB who underwent anti-tuberculous treatment. Data analysis is performed to determine whether there is a significant decline in interferon-gamma levels for the subjects over a period of six months, and to infer whether the anti-tuberculous treatment is effective.
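
One plausible form of the analysis, sketched with SciPy as a paired t-test on baseline versus six-month levels; the values are simulated and the paper's actual statistical methods may differ.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
baseline = rng.normal(10.0, 2.0, 73)            # IFN-gamma at start
month6 = baseline - rng.normal(2.0, 1.0, 73)    # simulated decline

t_stat, p_value = stats.ttest_rel(baseline, month6)
print(f"t = {t_stat:.2f}, p = {p_value:.3g}")
# A small p-value with a positive mean difference would support a
# significant decline, consistent with effective treatment.
```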

Keywords: data analysis, interferon gamma release assay, statistical methods, tuberculosis infection

24203 Short Text Classification Using Part of Speech Feature to Analyze Students' Feedback of Assessment Components

Authors: Zainab Mutlaq Ibrahim, Mohamed Bader-El-Den, Mihaela Cocea

Abstract:

Students' textual feedback can hold unique patterns and useful information about the learning process; it can hold information about the advantages and disadvantages of teaching methods, assessment components, facilities, and other aspects of teaching. The results of analysing such feedback can form a key input for institutions' decision makers to advance and update their systems accordingly. This paper proposes a data mining framework for analysing end-of-unit general textual feedback using a part-of-speech (PoS) feature with four machine learning algorithms: support vector machines, decision tree, random forest, and naive Bayes. The proposed framework has two tasks: first, to use the above algorithms to build an optimal model that automatically classifies the whole data set into two subsets, one tailored to assessment practices (assessment-related) and the other containing the non-assessment-related data; second, to use the same algorithms to build an optimal model for the whole data set and the new data subsets to automatically detect their sentiment. The significance of this paper is in comparing the performance of the above four algorithms using the part-of-speech feature with the performance of the same algorithms using an n-grams feature. The paper follows the Knowledge Discovery and Data Mining (KDDM) framework to construct the classification and sentiment analysis models: understanding the assessment domain, cleaning and pre-processing the data set, selecting and running the data mining algorithms, interpreting mined patterns, and consolidating the discovered knowledge. The experiments show that models using either feature performed very well on the first task, but on the second task, models that used the part-of-speech feature underperformed in comparison with models that used unigram and bigram features.
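
A hedged sketch of the part-of-speech feature: each feedback text is replaced by its PoS-tag sequence, vectorized, and fed to one of the four classifiers (an SVM here); the feedback examples and labels are invented.

```python
import nltk
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

nltk.download("punkt")
nltk.download("averaged_perceptron_tagger")

def pos_sequence(text):
    """'The exam was fair' -> 'DT NN VBD JJ'."""
    return " ".join(tag for _, tag in nltk.pos_tag(nltk.word_tokenize(text)))

feedback = ["The exam was too long and unfair",
            "Great lab facilities and helpful staff",
            "Coursework deadlines clashed with the exam",
            "The lecture rooms were cold"]
labels = ["assessment", "non-assessment", "assessment", "non-assessment"]

# PoS unigrams and bigrams as the feature space instead of word n-grams
model = make_pipeline(CountVectorizer(ngram_range=(1, 2)), LinearSVC())
model.fit([pos_sequence(t) for t in feedback], labels)
print(model.predict([pos_sequence("The quiz questions were confusing")]))
```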

Keywords: assessment, part of speech, sentiment analysis, student feedback

24202 Fast Fourier Transform-Based Steganalysis of Covert Communications over Streaming Media

Authors: Jinghui Peng, Shanyu Tang, Jia Li

Abstract:

Steganalysis seeks to detect the presence of secret data embedded in cover objects, and there is an imminent demand to detect hidden messages in streaming media. This paper shows how a steganalysis algorithm based on the Fast Fourier Transform (FFT) can be used to detect the existence of secret data embedded in streaming media. The proposed algorithm uses machine parameter characteristics and a network sniffer to determine whether the Internet traffic contains streaming channels. The detected streaming data are then transferred from the time domain to the frequency domain through the FFT. The distributions of power spectra in the frequency domain of original VoIP streams and stego VoIP streams are then compared using a t-test, achieving a p-value of 7.5686E-176, which is below the threshold. The results indicate that the proposed FFT-based steganalysis algorithm is effective in detecting secret data embedded in VoIP streaming media.
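
The core comparison can be sketched as below: FFT power spectra of a clean and a stego-like signal compared with a t-test; the signals are synthetic, whereas the paper's frames come from the network sniffer stage.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
t = np.arange(4096) / 8000.0                     # 8 kHz voice-band samples
clean = np.sin(2 * np.pi * 440 * t) + 0.05 * rng.normal(size=t.size)
stego = clean + 0.02 * rng.choice([-1, 1], t.size)  # LSB-like perturbation

def power_spectrum(x):
    """Squared magnitude of the one-sided FFT."""
    return np.abs(np.fft.rfft(x)) ** 2

t_stat, p = stats.ttest_ind(power_spectrum(clean), power_spectrum(stego))
print(f"p-value: {p:.3g}")   # a small p would suggest distinguishable spectra
```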

Keywords: steganalysis, security, Fast Fourier Transform, streaming media

24201 Privacy-Preserving Model for Social Network Sites to Prevent Unwanted Information Diffusion

Authors: Sanaz Kavianpour, Zuraini Ismail, Bharanidharan Shanmugam

Abstract:

Social Network Sites (SNSs) can serve as an invaluable platform for transferring information across a large number of individuals. A substantial component of communicating and managing information is to identify which individuals will influence others in propagating information, and whether dissemination of information will occur in the absence of social signals about that information. Classifying the final audience of social data is difficult, as completely controlling the social contexts that transfer among individuals is not possible. Hence, undesirable diffusion of information to an unauthorized individual on SNSs can threaten individuals' privacy. This paper highlights information diffusion in SNSs and, moreover, emphasizes the most significant privacy issues for individuals on SNSs. The goal of this paper is to propose a privacy-preserving model that attends closely to individuals' data in order to control the availability of data and improve privacy by providing access to the data for appropriate third parties without compromising the advantages of information sharing through SNSs.

Keywords: anonymization algorithm, classification algorithm, information diffusion, privacy, social network sites

24200 Using Focus Groups to Identify Mon Set Menus of Bang Kadi Community in Bangkok

Authors: S. Nitiworakarn

Abstract:

In recent years, focus-group discussions, as a source of qualitative data collection, have gained popularity within social science studies. Despite this popularity, analysing qualitative information, particularly from focus-group meetings, poses a challenge to most practitioner researchers. The Mons, also known as Raman, are considered to be one of the earliest peoples of mainland South-East Asia, found in scattered communities in Thailand, around the central valley and even in Bangkok. The present project responds to the need to identify traditional Mon set menus based on the participation of the Bang Kadi community in Bangkok, Thailand. The aim of this study was to generate Mon food set menus based on the participation of the community and to study the Mon food in the set menus of the Bang Kadi population through focus-group interviews and discussions held from May to October 2015 in the Bang Kadi community in Bangkok, Thailand. Data were collected using (1) focus group discussions between the researcher and 147 people in the community, including community leaders, women of the community and the elderly of the community, and (2) cooking sessions between the researcher and 22 residents of the community. The focus group discussions found that the Mon set menus of Bang Kadi residents comprised Kang Neng Kua-dit, Kang Luk-yom, Kang Som-Kajaeb, Kangleng Puk-pung, Yum Cha-cam, Pik-pa, Kao-new dek-ha and Num Ma-toom, and that the ingredients used in cooking are mainly local and seasonal. Most of the foods in the set menus derive from local wisdom.

Keywords: focus groups, Mon Food, set menus, Bangkok

24199 Application Difference between Cox and Logistic Regression Models

Authors: Idrissa Kayijuka

Abstract:

The logistic regression and Cox regression (proportional hazards) models are currently employed in the analysis of prospective epidemiologic research into risk factors for chronic diseases, and a theoretical relationship between the two models has been studied. By definition, the Cox regression model, also called the Cox proportional hazards model, is a procedure used to model data on the time leading up to an event when censored cases exist, whereas the logistic regression model is mostly applicable in cases where the independent variables consist of numerical as well as nominal values and the resultant variable is binary (dichotomous). Arguments and findings of many researchers have focused on overviews of the Cox and logistic regression models and their applications in different areas. In this work, the analysis is done on secondary data from the SPSS exercise data on breast cancer, with a sample size of 1121 women, where the main objective is to show the difference in application between the Cox regression model and the logistic regression model based on factors that cause women to die of breast cancer. Some of the analysis, i.e., on lymph node status, was done manually, and SPSS software helped to analyze the rest of the data. This study found that there is a difference in application between the Cox and logistic regression models: the Cox regression model is used to analyze data that include follow-up time, whereas the logistic regression model analyzes data without follow-up time. They also have different measures of association: the hazard ratio for the Cox model and the odds ratio for the logistic regression model. A similarity between the two models is that both are applicable to predicting the outcome of a categorical variable, i.e., a variable that can accommodate only a restricted number of categories. In conclusion, the Cox regression model differs from logistic regression by assessing a rate instead of a proportion. Both models are suitable methods for analyzing data and can be applied in many other studies, but the Cox regression model is the more recommended.
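
The contrast is easy to see in code: the sketch below fits both models to the same invented follow-up data using the lifelines and statsmodels packages, with the Cox model consuming time and censoring and the logistic model only the binary outcome.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter
import statsmodels.api as sm

df = pd.DataFrame({
    "months": [12, 30, 7, 24, 18, 40, 5, 33],   # follow-up time
    "died":   [1, 0, 1, 0, 0, 1, 1, 0],         # event indicator
    "nodes":  [4, 1, 6, 0, 3, 2, 8, 1],         # positive lymph nodes
})

# Cox model: uses both time and event -> hazard ratio = exp(coef)
cph = CoxPHFitter().fit(df, duration_col="months", event_col="died")
print(cph.summary[["coef", "exp(coef)"]])

# Logistic model: ignores follow-up time -> odds ratio = exp(coef)
logit = sm.Logit(df["died"], sm.add_constant(df["nodes"])).fit(disp=0)
print(np.exp(logit.params))
```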

Keywords: logistic regression model, Cox regression model, survival analysis, hazard ratio

24198 Text Mining of Twitter Data Using a Latent Dirichlet Allocation Topic Model and Sentiment Analysis

Authors: Sidi Yang, Haiyi Zhang

Abstract:

Twitter is a microblogging platform where millions of users daily share their attitudes, views, and opinions. Using a probabilistic Latent Dirichlet Allocation (LDA) topic model to discern the most popular topics in Twitter data is an effective way to analyze a large set of tweets and find a set of topics in a computationally efficient manner. Sentiment analysis provides an effective method to show the emotions and sentiments found in each tweet and an efficient way to summarize the results in a manner that is clearly understood. The primary goal of this paper is to explore text mining and to extract and analyze useful information from unstructured text using two approaches, LDA topic modelling and sentiment analysis, by examining Twitter plain-text data in English. These two methods allow people to dig into data more effectively and efficiently. The LDA topic model and sentiment analysis can also be applied to provide insights in business and scientific fields.
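
A compact sketch of the two approaches on invented tweets, using scikit-learn's LatentDirichletAllocation for topics and NLTK's VADER analyzer for sentiment; VADER is a common choice assumed here, not necessarily the paper's tool.

```python
import nltk
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon")

tweets = ["Love the new phone camera, photos look amazing",
          "Traffic this morning was awful, late again",
          "This phone battery dies so fast, disappointed",
          "Beautiful morning for a run in the park"]

# Topic modelling: two topics over a bag-of-words representation
vec = CountVectorizer(stop_words="english")
dtm = vec.fit_transform(tweets)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(dtm)
terms = vec.get_feature_names_out()
for k, comp in enumerate(lda.components_):
    print(f"topic {k}:", [terms[i] for i in comp.argsort()[-4:]])

# Sentiment: VADER compound score in [-1, 1] for each tweet
sia = SentimentIntensityAnalyzer()
for t in tweets:
    print(sia.polarity_scores(t)["compound"], t)
```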

Keywords: text mining, Twitter, topic model, sentiment analysis

24197 How Context- and Problem-Based Learning Affects Students' Behaviors in Teaching Thermodynamics

Authors: Mukadder Baran, Mustafa Sözbilir

Abstract:

The purpose of this paper is to investigate the applicability of Context- and Problem-Based Learning (CPBL) to the subject of “Thermodynamics” in a general chemistry course, as well as the influence of CPBL on students' achievement, retention of knowledge, interest, attitudes, motivation and problem-solving skills. The study group included 13 freshman students selected, with a sampling method appropriate to the purpose, from those taking the General Chemistry course within the Program of Medical Laboratory Techniques at Hakkari University. The application was carried out in the spring term of the 2012-2013 academic year. Lesson observation forms were used as the data collection tool. In light of the observations, it was revealed that CPBL increased the students' intragroup and intergroup communication skills as well as their self-confidence, and developed their skills in time management, presentation, reporting, and technology use; they were also able to relate chemistry to daily life. Based on these findings, it is suggested that the use of CPBL be widened and that seminars on constructive methods be organized for teachers. In this way, it is believed, students will no longer be passive in the group. In addition, it was concluded that in order to avoid the negative effects of socio-cultural structure on the education system, research should be conducted in places where there are socio-cultural obstacles, and appropriate solutions should be suggested and put into practice.

Keywords: chemistry, education, science, context-based learning

24196 Value Chain Based New Business Opportunity

Authors: Seonjae Lee, Sungjoo Lee

Abstract:

Excavating new business opportunities is necessary to remain competitive in the current business environment; companies survive rapidly changing industry conditions by adapting their business strategy and reducing technology challenges. Traditionally, two methods are used to excavate new businesses. In the first method, opportunities are gathered through qualitative analysis of expert opinion; in the second, new technologies are discovered through quantitative analysis of patent data. The second method increases time and cost, and patent data is restricted in its use for the purpose of discovering business opportunities. This study presents new business opportunities in a form customized to a company's characteristics (sector, size, etc.) by taking a value chain perspective, and the proposed model contributes to creating new business opportunities. It utilizes the trademark database of the Korean Intellectual Property Office (KIPO) and the proprietary company information database of the Korea Enterprise Data (KED). These data are key to discovering new business opportunities through analysis of competitors and advanced business trademarks (Module 1) and trading analysis of competitors found in the KED (Module 2).

Keywords: value chain, trademark, trading analysis, new business opportunity

24195 Towards Addressing the Cultural Snapshot Phenomenon in Cultural Mapping Libraries

Authors: Mousouris Spiridon, Kavakli Evangelia

Abstract:

This paper focuses on Digital Libraries (DLs) that contain and geovisualise cultural data, highlighting the need to define them as a separate category termed Cultural Mapping Libraries, based on their inherent connection of culture with geographic location and their design requirements in support of visual representation of cultural data on the map. An exploratory analysis of DLs that conform to the above definition brought forward the observation that existing Cultural Mapping Libraries fail to geovisualise the entirety of cultural data per point of interest, thus resulting in a Cultural Snapshot phenomenon. The existence of this phenomenon was reinforced by the results of a systematic bibliographic survey. In order to address the Cultural Snapshot, this paper proposes the use of Semantic Web principles to efficiently interconnect spatial cultural data through time, per geographic location. In this way, points of interest are transformed into scenery where culture evolves over time. This evolution is expressed as occurrences taking place chronologically, in an event-oriented approach, a conceptualization also endorsed by the CIDOC Conceptual Reference Model (CIDOC CRM). In particular, we posit the use of CIDOC CRM as the baseline for defining the logic of Cultural Mapping Libraries as part of the Culture Domain in accordance with the Digital Library Reference Model, in order to define the rules of cultural data management by the system. Our future goal is to transform this conceptual definition into inferencing rules that resolve the Cultural Snapshot and lead to a more complete geovisualisation of cultural data.
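
A minimal sketch of the event-oriented idea with rdflib: two dated events attached to one geolocated point of interest, so the place is no longer a single snapshot; the property names are simplified stand-ins for the corresponding CIDOC CRM terms.

```python
from rdflib import Graph, Literal, Namespace, RDF

EX = Namespace("http://example.org/culture/")   # placeholder vocabulary
g = Graph()

poi = EX.TempleSquare
g.add((poi, RDF.type, EX.PointOfInterest))
g.add((poi, EX.latitude, Literal(37.97)))
g.add((poi, EX.longitude, Literal(23.72)))

# Two occurrences through time at the same point of interest
for name, date in [("construction", "0438-01-01"),
                   ("restoration", "1975-06-01")]:
    ev = EX[name]
    g.add((ev, RDF.type, EX.Event))     # cf. CIDOC CRM E5 Event
    g.add((ev, EX.tookPlaceAt, poi))    # cf. CIDOC CRM P7 took place at
    g.add((ev, EX.date, Literal(date)))

# All events at the point of interest, in chronological order
events = sorted(g.subjects(EX.tookPlaceAt, poi),
                key=lambda e: str(g.value(e, EX.date)))
print([str(e) for e in events])
```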

Keywords: digital libraries, semantic web, geovisualization, CIDOC-CRM

24194 Assessment of ASEI-PDSI Method on Students’ Attitude and Achievement in Junior Secondary Schools Mathematics in FCT-Abuja

Authors: Amenaghawon Clement Osemwinyen

Abstract:

The Activity, Student-centred, Experiment, Improvisation - Plan, Do, See, Improve (ASEI-PDSI) method championed by the Strengthening Mathematics And Science Education (SMASE) - Nigeria Project is an attempt to improve the quality of mathematics education, which has consistently declined over the years in both public primary and secondary schools across the country. This study assessed the effect of the ASEI-PDSI method on students' attitudes and achievement in junior secondary school (JSS) mathematics in FCT-Abuja. A survey research design was adopted, and 100 mathematics teachers selected using a stratified random sampling method took part in the study. The data were collected using structured questionnaires and analyzed using descriptive statistics. The findings showed that the ASEI-PDSI method had significantly improved students' attitudes toward mathematics. The study also revealed that the ASEI-PDSI method significantly influenced JSS students' mathematics achievement. Among the recommendations were that teachers should be encouraged to adopt the ASEI-PDSI method in teaching and learning mathematics in order to create a mathematically stimulating classroom environment, which could positively influence JSS students' attitudes and academic performance in mathematics, and that stakeholders (government and other interest groups) should organize regular in-service training programs to improve teachers' teaching strategies, particularly as they affect the ASEI-PDSI method.

Keywords: achievement, ASEI-PDSI method, attitude, mathematics, SMASE

24193 An Evaluation of the Impact of E-Banking on Operational Efficiency of Banks in Nigeria

Authors: Ibrahim Rabiu Darazo

Abstract:

This research was conducted on the impact of e-banking on the operational efficiency of banks in Nigeria, using the case of selected banks (Diamond Bank Plc, GTBank Plc, and Fidelity Bank Plc). It is a quantitative study using both primary and secondary sources of data. Questionnaires were used to obtain primary data: 150 questionnaires were distributed among staff and customers of the three banks, and the data collected were analysed using the chi-square test, whereas the secondary data were obtained from relevant textbooks, journals and web sites. It is clear from the findings that the use of e-banking has improved the efficiency of these banks in terms of providing efficient services to customers electronically, using Internet banking, telephone banking and ATMs; reducing the time taken to serve customers; allowing new customers to open an account online; and giving customers access to their accounts at all times (24/7). E-banking provides access to customer information from the database, and the costs of cheques and postage were eliminated. The recommendations at the end of the research include that banks should keep their electronic gadgets up to date, that e-fraud (internal and external) should be controlled, that banks should employ qualified manpower, and that biometric ATMs should be introduced to reduce fraud with ATM cards, as is done in other countries such as the USA.

Keywords: banks, electronic banking, operational efficiency of banks, biometric ATMs

24192 Suitability of Satellite-Based Data for Groundwater Modelling in Southwest Nigeria

Authors: O. O. Aiyelokun, O. A. Agbede

Abstract:

Numerical modelling of groundwater flow can be susceptible to calibration errors due to the lack of adequate ground-based hydro-meteorological stations in river basins. Groundwater resources management in Southwest Nigeria is currently challenged by overexploitation, lack of planning and monitoring, urbanization and climate change; hence, for models to be adopted as decision support tools for sustainable management of groundwater, they must be adequately calibrated. Since river basins in Southwest Nigeria are characterized by missing data and lack adequate ground-based hydro-meteorological stations, the need to adopt satellite-based data for constructing distributed models is crucial. This study seeks to evaluate the suitability of satellite-based data as a substitute for ground-based data in computing boundary conditions, by determining whether ground- and satellite-based meteorological data fit well in the Ogun and Oshun River basins. The Climate Forecast System Reanalysis (CFSR) global meteorological dataset was first obtained in daily form and converted to monthly form for a period of 432 months (January 1979 to June 2014). Ground-based meteorological data for Ikeja (1981-2010), Abeokuta (1983-2010), and Oshogbo (1981-2010) were then compared with the CFSR data using goodness-of-fit (GOF) statistics. The study revealed that, based on mean absolute error (MAE), coefficient of correlation (r) and coefficient of determination (R²), all meteorological variables except wind speed fit well. It further revealed that maximum and minimum temperature, relative humidity and rainfall had a high index of agreement (d) and ratio of standard deviations (rSD), implying that the CFSR dataset can be used to compute boundary conditions such as groundwater recharge and potential evapotranspiration. The study concluded that satellite-based data such as CFSR should be used as input when constructing groundwater flow models in river basins in Southwest Nigeria, where the majority of river basins are partially gauged and characterized by long gaps in hydro-meteorological records.
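
The goodness-of-fit statistics named here are straightforward to compute; the sketch below evaluates MAE, r, R², Willmott's index of agreement d, and the ratio of standard deviations rSD for a synthetic ground-versus-CFSR pair.

```python
import numpy as np

rng = np.random.default_rng(5)
ground = rng.gamma(2.0, 60.0, 360)          # monthly rainfall, ground station
cfsr = ground + rng.normal(0, 25, 360)      # satellite-based estimate

def gof(obs, sim):
    mae = np.mean(np.abs(sim - obs))
    r = np.corrcoef(obs, sim)[0, 1]
    # Willmott's index of agreement: d = 1 - SSE / potential error
    d = 1 - np.sum((sim - obs) ** 2) / np.sum(
        (np.abs(sim - obs.mean()) + np.abs(obs - obs.mean())) ** 2)
    rsd = sim.std() / obs.std()
    return {"MAE": mae, "r": r, "R2": r ** 2, "d": d, "rSD": rsd}

for name, val in gof(ground, cfsr).items():
    print(f"{name}: {val:.3f}")
```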

Keywords: boundary condition, goodness of fit, groundwater, satellite-based data

24191 Geometry, the Language of Manifestation of the Tabriz School's Mystical Thoughts in Architecture (Case Study: Dome of Soltaniyeh)

Authors: Lida Balilan, Dariush Sattarzadeh, Rana Koorepaz

Abstract:

In the Ilkhanid era, the mystical school of Tabriz manifested itself as an art school in various domains, including miniatures, architecture, and urban planning and design, simultaneously with the expansion of the many sciences of its time. In this era, mysticism reached its peak both in poetry and prose and in works of art. Mysticism, as an inner belief and thought, brought its audience to an artistic and aesthetic view of religion through allegorical and symbolic expression, and had a direct impact on the formation of the intellectual and cultural layers of society. At the same time, with the expansion of science in various fields, the mystical school of Tabriz was able to create a symbolic and allegorical language for magnificent works of architecture, drawing on sciences such as mathematics, geometry and the science of numbers, and on Abjad letters. In this era, geometry is the middle link between mysticism and architecture, and it is divided, based on its function, into two categories: intellectual geometry and sensory geometry. The Soltaniyeh dome is one of the prominent buildings of the Tabriz school, built for use as a shrine. In this article, information is collected using a historical-interpretive method, and the results are analyzed using an analytical-comparative method. The results of the study suggest that the designers and builders of the Soltaniyeh dome used shapes, colors, numbers, letters and words, in the form of motifs, geometric patterns, lines and writings, at levels and layers ranging from plans to decorations and arrays, for architectural symbolization and encryption to express and transmit mystical ideas.

Keywords: geometry, Tabriz school, mystical thoughts, dome of Soltaniyeh
