Search results for: incremental and interactive mining.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 950

Search results for: incremental and interactive mining.

110 Optimizing Spatial Trend Detection By Artificial Immune Systems

Authors: M. Derakhshanfar, B. Minaei-Bidgoli

Abstract:

Spatial trends are one of the valuable patterns in geo databases. They play an important role in data analysis and knowledge discovery from spatial data. A spatial trend is a regular change of one or more non spatial attributes when spatially moving away from a start object. Spatial trend detection is a graph search problem therefore heuristic methods can be good solution. Artificial immune system (AIS) is a special method for searching and optimizing. AIS is a novel evolutionary paradigm inspired by the biological immune system. The models based on immune system principles, such as the clonal selection theory, the immune network model or the negative selection algorithm, have been finding increasing applications in fields of science and engineering. In this paper, we develop a novel immunological algorithm based on clonal selection algorithm (CSA) for spatial trend detection. We are created neighborhood graph and neighborhood path, then select spatial trends that their affinity is high for antibody. In an evolutionary process with artificial immune algorithm, affinity of low trends is increased with mutation until stop condition is satisfied.

Keywords: Spatial Data Mining, Spatial Trend Detection, Heuristic Methods, Artificial Immune System, Clonal Selection Algorithm (CSA)

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2019
109 Classifying Biomedical Text Abstracts based on Hierarchical 'Concept' Structure

Authors: Rozilawati Binti Dollah, Masaki Aono

Abstract:

Classifying biomedical literature is a difficult and challenging task, especially when a large number of biomedical articles should be organized into a hierarchical structure. In this paper, we present an approach for classifying a collection of biomedical text abstracts downloaded from Medline database with the help of ontology alignment. To accomplish our goal, we construct two types of hierarchies, the OHSUMED disease hierarchy and the Medline abstract disease hierarchies from the OHSUMED dataset and the Medline abstracts, respectively. Then, we enrich the OHSUMED disease hierarchy before adapting it to ontology alignment process for finding probable concepts or categories. Subsequently, we compute the cosine similarity between the vector in probable concepts (in the “enriched" OHSUMED disease hierarchy) and the vector in Medline abstract disease hierarchies. Finally, we assign category to the new Medline abstracts based on the similarity score. The results obtained from the experiments show the performance of our proposed approach for hierarchical classification is slightly better than the performance of the multi-class flat classification.

Keywords: Biomedical literature, hierarchical text classification, ontology alignment, text mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1988
108 Coastal Vulnerability Index and Its Projection for Odisha Coast, East Coast of India

Authors: Bishnupriya Sahoo, Prasad K. Bhaskaran

Abstract:

Tropical cyclone is one among the worst natural hazards that results in a trail of destruction causing enormous damage to life, property, and coastal infrastructures. In a global perspective, the Indian Ocean is considered as one of the cyclone prone basins in the world. Specifically, the frequency of cyclogenesis in the Bay of Bengal is higher compared to the Arabian Sea. Out of the four maritime states in the East coast of India, Odisha is highly susceptible to tropical cyclone landfall. Historical records clearly decipher the fact that the frequency of cyclones have reduced in this basin. However, in the recent decades, the intensity and size of tropical cyclones have increased. This is a matter of concern as the risk and vulnerability level of Odisha coast exposed to high wind speed and gusts during cyclone landfall have increased. In this context, there is a need to assess and evaluate the severity of coastal risk, area of exposure under risk, and associated vulnerability with a higher dimension in a multi-risk perspective. Changing climate can result in the emergence of a new hazard and vulnerability over a region with differential spatial and socio-economic impact. Hence there is a need to have coastal vulnerability projections in a changing climate scenario. With this motivation, the present study attempts to estimate the destructiveness of tropical cyclones based on Power Dissipation Index (PDI) for those cyclones that made landfall along Odisha coast that exhibits an increasing trend based on historical data. The study also covers the futuristic scenarios of integral coastal vulnerability based on the trends in PDI for the Odisha coast. This study considers 11 essential and important parameters; the cyclone intensity, storm surge, onshore inundation, mean tidal range, continental shelf slope, topo-graphic elevation onshore, rate of shoreline change, maximum wave height, relative sea level rise, rainfall distribution, and coastal geomorphology. The study signifies that over a decadal scale, the coastal vulnerability index (CVI) depends largely on the incremental change in variables such as cyclone intensity, storm surge, and associated inundation. In addition, the study also performs a critical analysis on the modulation of PDI on storm surge and inundation characteristics for the entire coastal belt of Odisha State. Interestingly, the study brings to light that a linear correlation exists between the storm-tide with PDI. The trend analysis of PDI and its projection for coastal Odisha have direct practical applications in effective coastal zone management and vulnerability assessment.

Keywords: Bay of Bengal, coastal vulnerability index, power dissipation index, tropical cyclone.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1269
107 Interaction of Elevated Carbon Dioxide and Temperature on Strawberry (Fragaria × ananassa) Growth and Fruit Yield

Authors: Himali N. Balasooriya, Kithsiri B. Dassanayake, Saman Seneweera, Said Ajlouni

Abstract:

Increase in atmospheric CO2 concentration [CO2] and ambient temperature associated with changing climatic conditions will have significant impacts on agriculture crop productivity and quality. Independent effects of the above two environmental variables on the growth, yield and quality of strawberry were well documented. Higher temperatures over the optimum range (20-25ºC) lead to crop failures, while elevated [CO2] stimulated plant growth and yield but compromised the physical quality of fruits. However, there is very limited understanding of the interaction between these variables on the plant growth, yield and quality. Therefore, this study was designed to investigate the interactive effect of high temperature and elevated [CO2] on growth, yield and quality of strawberries. Strawberry cultivars ‘Albion’ and ‘San Andreas’ were grown under six different combinations of two temperatures (25 and 30ºC) and three [CO2] (400, 650 and 950 µmol mol-1) in controlled-environmental growth chambers. Plant growth measurements such as plant height, canopy area, number of flowers, and fruit yield were measured during phonological development. Photosynthesis and transpiration, the ratio of intercellular to atmospheric [CO2] (Ci/Ca) were measured to estimate the physiological adjustment to climate stress. The impact of temperature and [CO2] interaction on growth and yield of strawberry was significant (p < 0.05). Across both cultivars, highest fruit yields were observed at 650 µmol mol-1 [CO2], which was particularly clear at 25°C. The fruit yield gradually decreased at 30°C under all the treatment combinations. However, photosynthesis rates were highest at 650 µmol mol-1 [CO2] but no increment was found at 900 µmol mol-1 [CO2]. Interestingly, Ci/Ca ratio increased with increasing atmospheric [CO2] which was predominant at high temperature. Similarly, fruit yield was substantially reduced at high [CO2] under high temperature. Our findings suggest that increased Ci/Ca ratio at high temperature is likely reduces the photosynthesis and thus yield response to elevated [CO2].

Keywords: Atmospheric [CO2], fruit yield, strawberry, temperature.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 933
106 Parkinsons Disease Classification using Neural Network and Feature Selection

Authors: Anchana Khemphila, Veera Boonjing

Abstract:

In this study, the Multi-Layer Perceptron (MLP)with Back-Propagation learning algorithm are used to classify to effective diagnosis Parkinsons disease(PD).It-s a challenging problem for medical community.Typically characterized by tremor, PD occurs due to the loss of dopamine in the brains thalamic region that results in involuntary or oscillatory movement in the body. A feature selection algorithm along with biomedical test values to diagnose Parkinson disease.Clinical diagnosis is done mostly by doctor-s expertise and experience.But still cases are reported of wrong diagnosis and treatment. Patients are asked to take number of tests for diagnosis.In many cases,not all the tests contribute towards effective diagnosis of a disease.Our work is to classify the presence of Parkinson disease with reduced number of attributes.Original,22 attributes are involved in classify.We use Information Gain to determine the attributes which reduced the number of attributes which is need to be taken from patients.The Artificial neural networks is used to classify the diagnosis of patients.Twenty-Two attributes are reduced to sixteen attributes.The accuracy is in training data set is 82.051% and in the validation data set is 83.333%.

Keywords: Data mining, classification, Parkinson disease, artificial neural networks, feature selection, information gain.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3737
105 A New DIDS Design Based on a Combination Feature Selection Approach

Authors: Adel Sabry Eesa, Adnan Mohsin Abdulazeez Brifcani, Zeynep Orman

Abstract:

Feature selection has been used in many fields such as classification, data mining and object recognition and proven to be effective for removing irrelevant and redundant features from the original dataset. In this paper, a new design of distributed intrusion detection system using a combination feature selection model based on bees and decision tree. Bees algorithm is used as the search strategy to find the optimal subset of features, whereas decision tree is used as a judgment for the selected features. Both the produced features and the generated rules are used by Decision Making Mobile Agent to decide whether there is an attack or not in the networks. Decision Making Mobile Agent will migrate through the networks, moving from node to another, if it found that there is an attack on one of the nodes, it then alerts the user through User Interface Agent or takes some action through Action Mobile Agent. The KDD Cup 99 dataset is used to test the effectiveness of the proposed system. The results show that even if only four features are used, the proposed system gives a better performance when it is compared with the obtained results using all 41 features.

Keywords: Distributed intrusion detection system, mobile agent, feature selection, Bees Algorithm, decision tree.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1902
104 Improving Spatiotemporal Change Detection: A High Level Fusion Approach for Discovering Uncertain Knowledge from Satellite Image Database

Authors: Wadii Boulila, Imed Riadh Farah, Karim Saheb Ettabaa, Basel Solaiman, Henda Ben Ghezala

Abstract:

This paper investigates the problem of tracking spa¬tiotemporal changes of a satellite image through the use of Knowledge Discovery in Database (KDD). The purpose of this study is to help a given user effectively discover interesting knowledge and then build prediction and decision models. Unfortunately, the KDD process for spatiotemporal data is always marked by several types of imperfections. In our paper, we take these imperfections into consideration in order to provide more accurate decisions. To achieve this objective, different KDD methods are used to discover knowledge in satellite image databases. Each method presents a different point of view of spatiotemporal evolution of a query model (which represents an extracted object from a satellite image). In order to combine these methods, we use the evidence fusion theory which considerably improves the spatiotemporal knowledge discovery process and increases our belief in the spatiotemporal model change. Experimental results of satellite images representing the region of Auckland in New Zealand depict the improvement in the overall change detection as compared to using classical methods.

Keywords: Knowledge discovery in satellite databases, knowledge fusion, data imperfection, data mining, spatiotemporal change detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1515
103 Comparing Academically Gifted and Non-Gifted Students- Supportive Environments in Jordan

Authors: Mustafa Qaseem Hielat, Ahmad Mohammad Al-Shabatat

Abstract:

Jordan exerts many efforts to nurture their academically gifted students in special schools since 2001. During the past nine years of launching these schools, their learning and excellence environments were believed to be distinguished compared to public schools. This study investigated the environments of gifted students compared with other non-gifted, using a survey instrument that measures the dimensions of family, peers, teachers, school- support, society, and resources –dimensions rooted deeply in supporting gifted education, learning, and achievement. A total number of 109 were selected from excellence schools for academically gifted students, and 119 non-gifted students were selected from public schools. Around 8.3% of the non-gifted students reported that they “Never" received any support from their surrounding environments, 14.9% reported “Seldom" support, 23.7% reported “ Often" support, 26.0% reported “Frequent" support, and 32.8% reported “Very frequent" support. Where the gifted students reported more “Never" support than the non-gifted did with 11.3%, “Seldom" support with 15.4%, “Often" support with 26.6%, “Frequent" support with 29.0%, and reported “Very frequent" support less than the non-gifted students with 23.6%. Unexpectedly, statistical differences were found between the two groups favoring non-gifted students in perception of their surrounding environments in specific dimensions, namely, school- support, teachers, and society. No statistical differences were found in the other dimensions of the survey, namely, family, peers, and resources. As the differences were found in teachers, school- support, and society, the nurturing environments for the excellence schools need to be revised to adopt more creative teaching styles, rich school atmosphere and infrastructures, interactive guiding for the students and their parents, promoting for the excellence environments, and re-build successful identification models. Thus, families, schools, and society should increase their cooperation, communication, and awareness of the gifted supportive environments. However, more studies to investigate other aspects of promoting academic giftedness and excellence are recommended.

Keywords: Academic giftedness, Supportive environment, Excellence schools, Gifted grouping, Gifted nurturing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1845
102 Protein Secondary Structure Prediction Using Parallelized Rule Induction from Coverings

Authors: Leong Lee, Cyriac Kandoth, Jennifer L. Leopold, Ronald L. Frank

Abstract:

Protein 3D structure prediction has always been an important research area in bioinformatics. In particular, the prediction of secondary structure has been a well-studied research topic. Despite the recent breakthrough of combining multiple sequence alignment information and artificial intelligence algorithms to predict protein secondary structure, the Q3 accuracy of various computational prediction algorithms rarely has exceeded 75%. In a previous paper [1], this research team presented a rule-based method called RT-RICO (Relaxed Threshold Rule Induction from Coverings) to predict protein secondary structure. The average Q3 accuracy on the sample datasets using RT-RICO was 80.3%, an improvement over comparable computational methods. Although this demonstrated that RT-RICO might be a promising approach for predicting secondary structure, the algorithm-s computational complexity and program running time limited its use. Herein a parallelized implementation of a slightly modified RT-RICO approach is presented. This new version of the algorithm facilitated the testing of a much larger dataset of 396 protein domains [2]. Parallelized RTRICO achieved a Q3 score of 74.6%, which is higher than the consensus prediction accuracy of 72.9% that was achieved for the same test dataset by a combination of four secondary structure prediction methods [2].

Keywords: data mining, protein secondary structure prediction, parallelization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1568
101 Optimization of Samarium Extraction via Nanofluid-Based Emulsion Liquid Membrane Using Cyanex 272 as Mobile Carrier

Authors: Maliheh Raji, Hossein Abolghasemi, Jaber Safdari, Ali Kargari

Abstract:

Samarium as a rare-earth element is playing a growing important role in high technology. Traditional methods for extraction of rare earth metals such as ion exchange and solvent extraction have disadvantages of high investment and high energy consumption. Emulsion liquid membrane (ELM) as an improved solvent extraction technique is an effective transport method for separation of various compounds from aqueous solutions. In this work, the extraction of samarium from aqueous solutions by ELM was investigated using response surface methodology (RSM). The organic membrane phase of the ELM was a nanofluid consisted of multiwalled carbon nanotubes (MWCNT), Span80 as surfactant, Cyanex 272 as mobile carrier, and kerosene as base fluid. 1 M nitric acid solution was used as internal aqueous phase. The effects of the important process parameters on samarium extraction were investigated, and the values of these parameters were optimized using the Central Composition Design (CCD) of RSM. These parameters were the concentration of MWCNT in nanofluid, the carrier concentration, and the volume ratio of organic membrane phase to internal phase (Roi). The three-dimensional (3D) response surfaces of samarium extraction efficiency were obtained to visualize the individual and interactive effects of the process variables. A regression model for % extraction was developed, and its adequacy was evaluated. The result shows that % extraction improves by using MWCNT nanofluid in organic membrane phase and extraction efficiency of 98.92% can be achieved under the optimum conditions. In addition, demulsification was successfully performed and the recycled membrane phase was proved to be effective in the optimum condition.

Keywords: Cyanex 272, emulsion liquid membrane, multiwalled carbon nanotubes, nanofluid, response surface methodology, Samarium.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1813
100 Designing a Framework for Network Security Protection

Authors: Eric P. Jiang

Abstract:

As the Internet continues to grow at a rapid pace as the primary medium for communications and commerce and as telecommunication networks and systems continue to expand their global reach, digital information has become the most popular and important information resource and our dependence upon the underlying cyber infrastructure has been increasing significantly. Unfortunately, as our dependency has grown, so has the threat to the cyber infrastructure from spammers, attackers and criminal enterprises. In this paper, we propose a new machine learning based network intrusion detection framework for cyber security. The detection process of the framework consists of two stages: model construction and intrusion detection. In the model construction stage, a semi-supervised machine learning algorithm is applied to a collected set of network audit data to generate a profile of normal network behavior and in the intrusion detection stage, input network events are analyzed and compared with the patterns gathered in the profile, and some of them are then flagged as anomalies should these events are sufficiently far from the expected normal behavior. The proposed framework is particularly applicable to the situations where there is only a small amount of labeled network training data available, which is very typical in real world network environments.

Keywords: classification, data analysis and mining, network intrusion detection, semi-supervised learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1766
99 Nanopaper Innovation in Paper and Packaging Industry

Authors: Hajar Mohammadpour Kachlami , Ghasem Javadzadeh Moghtader , Habib Mohammadpour Kachlami

Abstract:

Nowadays due to globalization of economy and competition environment, innovation and technology plays key role at creation of wealth and economic growth of countries. In fact prompt growth of practical and technologic knowledge may results in social benefits for countries when changes into effective innovation. Considering the importance of innovation for the development of countries, this study addresses the radical technological innovation introduced by nanopapers at different stages of producing paper including stock preparation, using authorized additives, fillers and pigments, using retention, calender, stages of producing conductive paper, porous nanopaper and Layer by layer self-assembly. Research results show that in coming years the jungle related products will lose considerable portion of their market share, unless embracing radical innovation. Although incremental innovations can make this industry still competitive in mid-term, but to have economic growth and competitive advantage in long term, radical innovations are necessary. Radical innovations can lead to new products and materials which their applications in packaging industry can produce value added. However application of nanotechnology in this industry can be costly, it can be done in cooperation with other industries to make the maximum use of nanotechnology possible. Therefore this technology can be used in all the production process resulting in the mass production of simple and flexible papers with low cost and special properties such as facility at shape, form, easy transportation, light weight, recovery and recycle marketing abilities, and sealing. Improving the resistance of the packaging materials without reducing the performance of packaging materials enhances the quality and the value added of packaging. Improving the cellulose at nano scale can have considerable electron optical and magnetic effects leading to improvement in packaging and value added. Comparing to the specifications of thermoplastic products and ordinary papers, nanopapers show much better performance in terms of effective mechanical indexes such as the modulus of elasticity, tensile strength, and strain-stress. In densities lower than 640 kgm -3, due to the network structure of nanofibers and the balanced and randomized distribution of NFC in flat space, these specifications will even improve more. For nanopapers, strains are 1,4Gpa, 84Mpa and 17%, 13,3 Gpa, 214Mpa and 10% respectively. In layer by layer self assembly method (LbL) the tensile strength of nanopaper with Tio3 particles and Sio2 and halloysite clay nanotube are 30,4 ±7.6Nm/g and 13,6 ±0.8Nm/g and 14±0.3,3Nm/g respectively that fall within acceptable range of similar samples with virgin fiber. The usage of improved brightness and porosity index in nanopapers can create more competitive advantages at packaging industry.

Keywords: Innovation; NanoPaper; Nanofiber; Packaging

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3088
98 Mixtures of Monotone Networks for Prediction

Authors: Marina Velikova, Hennie Daniels, Ad Feelders

Abstract:

In many data mining applications, it is a priori known that the target function should satisfy certain constraints imposed by, for example, economic theory or a human-decision maker. In this paper we consider partially monotone prediction problems, where the target variable depends monotonically on some of the input variables but not on all. We propose a novel method to construct prediction models, where monotone dependences with respect to some of the input variables are preserved by virtue of construction. Our method belongs to the class of mixture models. The basic idea is to convolute monotone neural networks with weight (kernel) functions to make predictions. By using simulation and real case studies, we demonstrate the application of our method. To obtain sound assessment for the performance of our approach, we use standard neural networks with weight decay and partially monotone linear models as benchmark methods for comparison. The results show that our approach outperforms partially monotone linear models in terms of accuracy. Furthermore, the incorporation of partial monotonicity constraints not only leads to models that are in accordance with the decision maker's expertise, but also reduces considerably the model variance in comparison to standard neural networks with weight decay.

Keywords: mixture models, monotone neural networks, partially monotone models, partially monotone problems.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1214
97 A Methodology for Investigating Public Opinion Using Multilevel Text Analysis

Authors: William Xiu Shun Wong, Myungsu Lim, Yoonjin Hyun, Chen Liu, Seongi Choi, Dasom Kim, Kee-Young Kwahk, Namgyu Kim

Abstract:

Recently, many users have begun to frequently share their opinions on diverse issues using various social media. Therefore, numerous governments have attempted to establish or improve national policies according to the public opinions captured from various social media. In this paper, we indicate several limitations of the traditional approaches to analyze public opinion on science and technology and provide an alternative methodology to overcome these limitations. First, we distinguish between the science and technology analysis phase and the social issue analysis phase to reflect the fact that public opinion can be formed only when a certain science and technology is applied to a specific social issue. Next, we successively apply a start list and a stop list to acquire clarified and interesting results. Finally, to identify the most appropriate documents that fit with a given subject, we develop a new logical filter concept that consists of not only mere keywords but also a logical relationship among the keywords. This study then analyzes the possibilities for the practical use of the proposed methodology thorough its application to discover core issues and public opinions from 1,700,886 documents comprising SNS, blogs, news, and discussions.

Keywords: Big data, social network analysis, text mining, topic modeling.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1634
96 M2LGP: Mining Multiple Level Gradual Patterns

Authors: Yogi Satrya Aryadinata, Anne Laurent, Michel Sala

Abstract:

Gradual patterns have been studied for many years as they contain precious information. They have been integrated in many expert systems and rule-based systems, for instance to reason on knowledge such as “the greater the number of turns, the greater the number of car crashes”. In many cases, this knowledge has been considered as a rule “the greater the number of turns → the greater the number of car crashes” Historically, works have thus been focused on the representation of such rules, studying how implication could be defined, especially fuzzy implication. These rules were defined by experts who were in charge to describe the systems they were working on in order to turn them to operate automatically. More recently, approaches have been proposed in order to mine databases for automatically discovering such knowledge. Several approaches have been studied, the main scientific topics being: how to determine what is an relevant gradual pattern, and how to discover them as efficiently as possible (in terms of both memory and CPU usage). However, in some cases, end-users are not interested in raw level knowledge, and are rather interested in trends. Moreover, it may be the case that no relevant pattern can be discovered at a low level of granularity (e.g. city), whereas some can be discovered at a higher level (e.g. county). In this paper, we thus extend gradual pattern approaches in order to consider multiple level gradual patterns. For this purpose, we consider two aggregation policies, namely horizontal and vertical.

Keywords: Gradual Pattern.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1469
95 Inductive Grammar, Student-Centered Reading, and Interactive Poetry: The Effects of Teaching English with Fun in Schools of Two Villages in Lebanon

Authors: Talar Agopian

Abstract:

Teaching English as a Second Language (ESL) is a common practice in many Lebanese schools. However, ESL teaching is done in traditional ways. Methods such as constructivism are seldom used, especially in villages. Here lies the significance of this research which joins constructivism and Piaget’s theory of cognitive development in ESL classes in Lebanese villages. The purpose of the present study is to explore the effects of applying constructivist student-centered strategies in teaching grammar, reading comprehension, and poetry on students in elementary ESL classes in two villages in Lebanon, Zefta in South Lebanon and Boqaata in Mount Lebanon. 20 English teachers participated in a training titled “Teaching English with Fun”, which focused on strategies that create a student-centered class where active learning takes place and there is increased learner engagement and autonomy. The training covered three main areas in teaching English: grammar, reading comprehension, and poetry. After participating in the training, the teachers applied the new strategies and methods in their ESL classes. The methodology comprised two phases: in phase one, practice-based research was conducted as the teachers attended the training and applied the constructivist strategies in their respective ESL classes. Phase two included the reflections of the teachers on the effects of the application of constructivist strategies. The results revealed the educational benefits of constructivist student-centered strategies; the students of teachers who applied these strategies showed improved engagement, positive attitudes towards poetry, increased motivation, and a better sense of autonomy. Future research is required in applying constructivist methods in the areas of writing, spelling, and vocabulary in ESL classrooms of Lebanese villages.

Keywords: Active learning, constructivism, learner engagement, student-centered strategies.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 730
94 The Robust Clustering with Reduction Dimension

Authors: Dyah E. Herwindiati

Abstract:

A clustering is process to identify a homogeneous groups of object called as cluster. Clustering is one interesting topic on data mining. A group or class behaves similarly characteristics. This paper discusses a robust clustering process for data images with two reduction dimension approaches; i.e. the two dimensional principal component analysis (2DPCA) and principal component analysis (PCA). A standard approach to overcome this problem is dimension reduction, which transforms a high-dimensional data into a lower-dimensional space with limited loss of information. One of the most common forms of dimensionality reduction is the principal components analysis (PCA). The 2DPCA is often called a variant of principal component (PCA), the image matrices were directly treated as 2D matrices; they do not need to be transformed into a vector so that the covariance matrix of image can be constructed directly using the original image matrices. The decomposed classical covariance matrix is very sensitive to outlying observations. The objective of paper is to compare the performance of robust minimizing vector variance (MVV) in the two dimensional projection PCA (2DPCA) and the PCA for clustering on an arbitrary data image when outliers are hiden in the data set. The simulation aspects of robustness and the illustration of clustering images are discussed in the end of paper

Keywords: Breakdown point, Consistency, 2DPCA, PCA, Outlier, Vector Variance

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1671
93 Sliding Joints and Soil-Structure Interaction

Authors: Radim Cajka, Pavlina Mateckova, Martina Janulikova, Marie Stara

Abstract:

Use of a sliding joint is an effective method to decrease the stress in foundation structure where there is a horizontal deformation of subsoil (areas afflicted with underground mining) or horizontal deformation of a foundation structure (pre-stressed foundations, creep, shrinkage, temperature deformation). A convenient material for a sliding joint is a bitumen asphalt belt. Experiments for different types of bitumen belts were undertaken at the Faculty of Civil Engineering - VSB Technical University of Ostrava in 2008. This year an extension of the 2008 experiments is in progress and the shear resistance of a slide joint is being tested as a function of temperature in a temperature controlled room. In this paper experimental results of temperature dependant shear resistance are presented. The result of the experiments should be the sliding joint shear resistance as a function of deformation velocity and temperature. This relationship is used for numerical analysis of stress/strain relation between foundation structure and subsoil. Using a rheological slide joint could lead to a decrease of the reinforcement amount, and contribute to higher reliability of foundation structure and thus enable design of more durable and sustainable building structures.

Keywords: Pre-stressed foundations, sliding joint, soil-structure interaction, subsoil horizontal deformation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1986
92 Evaluation of the Urban Regeneration Project: Land Use Transformation and SNS Big Data Analysis

Authors: Ju-Young Kim, Tae-Heon Moon, Jung-Hun Cho

Abstract:

Urban regeneration projects have been actively promoted in Korea. In particular, Jeonju Hanok Village is evaluated as one of representative cases in terms of utilizing local cultural heritage sits in the urban regeneration project. However, recently, there has been a growing concern in this area, due to the ‘gentrification’, caused by the excessive commercialization and surging tourists. This trend was changing land and building use and resulted in the loss of identity of the region. In this regard, this study analyzed the land use transformation between 2010 and 2016 to identify the commercialization trend in Jeonju Hanok Village. In addition, it conducted SNS big data analysis on Jeonju Hanok Village from February 14th, 2016 to March 31st, 2016 to identify visitors’ awareness of the village. The study results demonstrate that rapid commercialization was underway, unlikely the initial intention, so that planners and officials in city government should reconsider the project direction and rebuild deliberate management strategies. This study is meaningful in that it analyzed the land use transformation and SNS big data to identify the current situation in urban regeneration area. Furthermore, it is expected that the study results will contribute to the vitalization of regeneration area.

Keywords: Land use, SNS, text mining, urban regeneration.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1181
91 A Hybrid Feature Selection and Deep Learning Algorithm for Cancer Disease Classification

Authors: Niousha Bagheri Khulenjani, Mohammad Saniee Abadeh

Abstract:

Learning from very big datasets is a significant problem for most present data mining and machine learning algorithms. MicroRNA (miRNA) is one of the important big genomic and non-coding datasets presenting the genome sequences. In this paper, a hybrid method for the classification of the miRNA data is proposed. Due to the variety of cancers and high number of genes, analyzing the miRNA dataset has been a challenging problem for researchers. The number of features corresponding to the number of samples is high and the data suffer from being imbalanced. The feature selection method has been used to select features having more ability to distinguish classes and eliminating obscures features. Afterward, a Convolutional Neural Network (CNN) classifier for classification of cancer types is utilized, which employs a Genetic Algorithm to highlight optimized hyper-parameters of CNN. In order to make the process of classification by CNN faster, Graphics Processing Unit (GPU) is recommended for calculating the mathematic equation in a parallel way. The proposed method is tested on a real-world dataset with 8,129 patients, 29 different types of tumors, and 1,046 miRNA biomarkers, taken from The Cancer Genome Atlas (TCGA) database.

Keywords: Cancer classification, feature selection, deep learning, genetic algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1205
90 CompPSA: A Component-Based Pairwise RNA Secondary Structure Alignment Algorithm

Authors: Ghada Badr, Arwa Alturki

Abstract:

The biological function of an RNA molecule depends on its structure. The objective of the alignment is finding the homology between two or more RNA secondary structures. Knowing the common functionalities between two RNA structures allows a better understanding and a discovery of other relationships between them. Besides, identifying non-coding RNAs -that is not translated into a protein- is a popular application in which RNA structural alignment is the first step A few methods for RNA structure-to-structure alignment have been developed. Most of these methods are partial structure-to-structure, sequence-to-structure, or structure-to-sequence alignment. Less attention is given in the literature to the use of efficient RNA structure representation and the structure-to-structure alignment methods are lacking. In this paper, we introduce an O(N2) Component-based Pairwise RNA Structure Alignment (CompPSA) algorithm, where structures are given as a component-based representation and where N is the maximum number of components in the two structures. The proposed algorithm compares the two RNA secondary structures based on their weighted component features rather than on their base-pair details. Extensive experiments are conducted illustrating the efficiency of the CompPSA algorithm when compared to other approaches and on different real and simulated datasets. The CompPSA algorithm shows an accurate similarity measure between components. The algorithm gives the flexibility for the user to align the two RNA structures based on their weighted features (position, full length, and/or stem length). Moreover, the algorithm proves scalability and efficiency in time and memory performance.

Keywords: Alignment, RNA secondary structure, pairwise, component-based, data mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 945
89 Development of Prediction Models of Day-Ahead Hourly Building Electricity Consumption and Peak Power Demand Using the Machine Learning Method

Authors: Dalin Si, Azizan Aziz, Bertrand Lasternas

Abstract:

To encourage building owners to purchase electricity at the wholesale market and reduce building peak demand, this study aims to develop models that predict day-ahead hourly electricity consumption and demand using artificial neural network (ANN) and support vector machine (SVM). All prediction models are built in Python, with tool Scikit-learn and Pybrain. The input data for both consumption and demand prediction are time stamp, outdoor dry bulb temperature, relative humidity, air handling unit (AHU), supply air temperature and solar radiation. Solar radiation, which is unavailable a day-ahead, is predicted at first, and then this estimation is used as an input to predict consumption and demand. Models to predict consumption and demand are trained in both SVM and ANN, and depend on cooling or heating, weekdays or weekends. The results show that ANN is the better option for both consumption and demand prediction. It can achieve 15.50% to 20.03% coefficient of variance of root mean square error (CVRMSE) for consumption prediction and 22.89% to 32.42% CVRMSE for demand prediction, respectively. To conclude, the presented models have potential to help building owners to purchase electricity at the wholesale market, but they are not robust when used in demand response control.

Keywords: Building energy prediction, data mining, demand response, electricity market.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2180
88 Virulent-GO: Prediction of Virulent Proteins in Bacterial Pathogens Utilizing Gene Ontology Terms

Authors: Chia-Ta Tsai, Wen-Lin Huang, Shinn-Jang Ho, Li-Sun Shu, Shinn-Ying Ho

Abstract:

Prediction of bacterial virulent protein sequences can give assistance to identification and characterization of novel virulence-associated factors and discover drug/vaccine targets against proteins indispensable to pathogenicity. Gene Ontology (GO) annotation which describes functions of genes and gene products as a controlled vocabulary of terms has been shown effectively for a variety of tasks such as gene expression study, GO annotation prediction, protein subcellular localization, etc. In this study, we propose a sequence-based method Virulent-GO by mining informative GO terms as features for predicting bacterial virulent proteins. Each protein in the datasets used by the existing method VirulentPred is annotated by using BLAST to obtain its homologies with known accession numbers for retrieving GO terms. After investigating various popular classifiers using the same five-fold cross-validation scheme, Virulent-GO using the single kind of GO term features with an accuracy of 82.5% is slightly better than VirulentPred with 81.8% using five kinds of sequence-based features. For the evaluation of independent test, Virulent-GO also yields better results (82.0%) than VirulentPred (80.7%). When evaluating single kind of feature with SVM, the GO term feature performs much well, compared with each of the five kinds of features.

Keywords: Bacterial virulence factors, GO terms, prediction, protein sequence.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2162
87 Development of a Technology Assessment Model by Patents and Customers' Review Data

Authors: Kisik Song, Sungjoo Lee

Abstract:

Recent years have seen an increasing number of patent disputes due to excessive competition in the global market and a reduced technology life-cycle; this has increased the risk of investment in technology development. While many global companies have started developing a methodology to identify promising technologies and assess for decisions, the existing methodology still has some limitations. Post hoc assessments of the new technology are not being performed, especially to determine whether the suggested technologies turned out to be promising. For example, in existing quantitative patent analysis, a patent’s citation information has served as an important metric for quality assessment, but this analysis cannot be applied to recently registered patents because such information accumulates over time. Therefore, we propose a new technology assessment model that can replace citation information and positively affect technological development based on post hoc analysis of the patents for promising technologies. Additionally, we collect customer reviews on a target technology to extract keywords that show the customers’ needs, and we determine how many keywords are covered in the new technology. Finally, we construct a portfolio (based on a technology assessment from patent information) and a customer-based marketability assessment (based on review data), and we use them to visualize the characteristics of the new technologies.

Keywords: Technology assessment, patents, citation information, opinion mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 958
86 A Methodology for Automatic Diversification of Document Categories

Authors: Dasom Kim, Chen Liu, Myungsu Lim, Soo-Hyeon Jeon, Byeoung Kug Jeon, Kee-Young Kwahk, Namgyu Kim

Abstract:

Recently, numerous documents including large volumes of unstructured data and text have been created because of the rapid increase in the use of social media and the Internet. Usually, these documents are categorized for the convenience of users. Because the accuracy of manual categorization is not guaranteed, and such categorization requires a large amount of time and incurs huge costs. Many studies on automatic categorization have been conducted to help mitigate the limitations of manual categorization. Unfortunately, most of these methods cannot be applied to categorize complex documents with multiple topics because they work on the assumption that individual documents can be categorized into single categories only. Therefore, to overcome this limitation, some studies have attempted to categorize each document into multiple categories. However, the learning process employed in these studies involves training using a multi-categorized document set. These methods therefore cannot be applied to the multi-categorization of most documents unless multi-categorized training sets using traditional multi-categorization algorithms are provided. To overcome this limitation, in this study, we review our novel methodology for extending the category of a single-categorized document to multiple categorizes, and then introduce a survey-based verification scenario for estimating the accuracy of our automatic categorization methodology.

Keywords: Big Data Analysis, Document Classification, Text Mining, Topic Analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1722
85 Recommender Systems Using Ensemble Techniques

Authors: Yeonjeong Lee, Kyoung-jae Kim, Youngtae Kim

Abstract:

This study proposes a novel recommender system that uses data mining and multi-model ensemble techniques to enhance the recommendation performance through reflecting the precise user’s preference. The proposed model consists of two steps. In the first step, this study uses logistic regression, decision trees, and artificial neural networks to predict customers who have high likelihood to purchase products in each product group. Then, this study combines the results of each predictor using the multi-model ensemble techniques such as bagging and bumping. In the second step, this study uses the market basket analysis to extract association rules for co-purchased products. Finally, the system selects customers who have high likelihood to purchase products in each product group and recommends proper products from same or different product groups to them through above two steps. We test the usability of the proposed system by using prototype and real-world transaction and profile data. In addition, we survey about user satisfaction for the recommended product list from the proposed system and the randomly selected product lists. The results also show that the proposed system may be useful in real-world online shopping store.

Keywords: Product recommender system, Ensemble technique, Association rules, Decision tree, Artificial neural networks.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4189
84 Promoting Creative and Critical Thinking in Mathematics: An Exploratory Study

Authors: A. Breda, C. Cruz

Abstract:

The Japanese art of origami provides a rich context for designing exploratory mathematical activities for children and young people. By folding a simple sheet of paper, fascinating and surprising planar and spatial configurations emerge. Equally surprising is the unfolding process, which also produces striking patterns. The procedure of folding, unfolding, and folding again allows the exploration of interesting geometric patterns. When adequately and systematically done, we may deduce some of the mathematical rules ruling origami. As the child/youth folds the sheet of paper repeatedly, he can physically observe how the forms he obtains are transformed and how they relate to the pattern of the corresponding unfolding, creating space for the understanding/discovery of mathematical principles regulating the folding-unfolding process. As part of a 2023 Summer Academy organized by a Portuguese university, a session entitled “Folding, Thinking and Generalizing” took place. 23 students attended the session, all enrolled in the 2nd cycle of Portuguese Basic Education and aged between 10 and 12 years old. The main focus of this session was to foster the development of critical cognitive and socio-emotional skills among these young learners, using origami. These skills included creativity, critical analysis, mathematical reasoning, collaboration, and communication. Employing a qualitative, descriptive, and interpretative analysis of data, collected during the session through field notes and students’ written productions, our findings reveal that structured origami-based activities not only promote student engagement with mathematical concepts in a playful and interactive but also facilitate the development of socio-emotional skills, which include collaboration and effective communication between participants. This research highlights the value of integrating origami into educational practices, highlighting its role in supporting comprehensive cognitive and emotional learning experiences.

Keywords: Active learning, hands-on activities, origami, creativity, critical thinking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 45
83 Strategic Mine Planning: A SWOT Analysis Applied to KOV Open Pit Mine in the Democratic Republic of Congo

Authors: Patrick May Mukonki

Abstract:

KOV pit (Kamoto Oliveira Virgule) is located 10 km from Kolwezi town, one of the mineral rich town in the Lualaba province of the Democratic Republic of Congo. The KOV pit is currently operating under the Katanga Mining Limited (KML), a Glencore-Gecamines (a State Owned Company) join venture. Recently, the mine optimization process provided a life of mine of approximately 10 years withnice pushbacks using the Datamine NPV Scheduler software. In previous KOV pit studies, we recently outlined the impact of the accuracy of the geological information on a long-term mine plan for a big copper mine such as KOV pit. The approach taken, discussed three main scenarios and outlined some weaknesses on the geological information side, and now, in this paper that we are going to develop here, we are going to highlight, as an overview, those weaknesses, strengths and opportunities, in a global SWOT analysis. The approach we are taking here is essentially descriptive in terms of steps taken to optimize KOV pit and, at every step, we categorized the challenges we faced to have a better tradeoff between what we called strengths and what we called weaknesses. The same logic is applied in terms of the opportunities and threats. The SWOT analysis conducted in this paper demonstrates that, despite a general poor ore body definition, and very rude ground water conditions, there is room for improvement for such high grade ore body.

Keywords: Mine planning, mine optimization, mine scheduling, SWOT analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1548
82 SNC Based Network Layer Design for Underwater Wireless Communication Used in Coral Farms

Authors: T. T. Manikandan, Rajeev Sukumaran

Abstract:

For maintaining the biodiversity of many ecosystems the existence of coral reefs play a vital role. But due to many factors such as pollution and coral mining, coral reefs are dying day by day. One way to protect the coral reefs is to farm them in a carefully monitored underwater environment and restore it in place of dead corals. For successful farming of corals in coral farms, different parameters of the water in the farming area need to be monitored and maintained at optimal level. Sensing underwater parameters using wireless sensor nodes is an effective way for precise and continuous monitoring in a highly dynamic environment like oceans. Here the sensed information is of varying importance and it needs to be provided with desired Quality of Service(QoS) guarantees in delivering the information to offshore monitoring centers. The main interest of this research is Stochastic Network Calculus (SNC) based modeling of network layer design for underwater wireless sensor communication. The model proposed in this research enforces differentiation of service in underwater wireless sensor communication with the help of buffer sizing and link scheduling. The delay and backlog bounds for such differentiated services are analytically derived using stochastic network calculus.

Keywords: Underwater Coral Farms, SNC, differentiated service, delay bound, backlog bound.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 315
81 Malware Beaconing Detection by Mining Large-scale DNS Logs for Targeted Attack Identification

Authors: Andrii Shalaginov, Katrin Franke, Xiongwei Huang

Abstract:

One of the leading problems in Cyber Security today is the emergence of targeted attacks conducted by adversaries with access to sophisticated tools. These attacks usually steal senior level employee system privileges, in order to gain unauthorized access to confidential knowledge and valuable intellectual property. Malware used for initial compromise of the systems are sophisticated and may target zero-day vulnerabilities. In this work we utilize common behaviour of malware called ”beacon”, which implies that infected hosts communicate to Command and Control servers at regular intervals that have relatively small time variations. By analysing such beacon activity through passive network monitoring, it is possible to detect potential malware infections. So, we focus on time gaps as indicators of possible C2 activity in targeted enterprise networks. We represent DNS log files as a graph, whose vertices are destination domains and edges are timestamps. Then by using four periodicity detection algorithms for each pair of internal-external communications, we check timestamp sequences to identify the beacon activities. Finally, based on the graph structure, we infer the existence of other infected hosts and malicious domains enrolled in the attack activities.

Keywords: Malware detection, network security, targeted attack.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6015