Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 39

Search results for: Heba Sami Zaky

9 Clustering Categorical Data Using the K-Means Algorithm and the Attribute’s Relative Frequency

Authors: Semeh Ben Salem, Sami Naouali, Moetez Sallami

Abstract:

Clustering is a well-known data mining technique used in pattern recognition and information retrieval. The initial dataset to be clustered can contain either categorical or numeric data, and each type of data has its own specific clustering algorithm. In this context, two algorithms are proposed: k-means for clustering numeric datasets and k-modes for categorical datasets. A main problem encountered in data mining applications is clustering categorical data, which is so common in datasets. One way to carry out the clustering process on categorical values is to transform the categorical attributes into numeric measures and apply the k-means algorithm directly instead of k-modes. In this paper, we experiment with an approach based on this idea, transforming the categorical values into numeric ones using the relative frequency of each modality in the attributes. The proposed approach is compared with a previous method based on transforming the categorical datasets into binary values. The scalability and accuracy of the two methods are evaluated experimentally. The obtained results show that our proposed method outperforms the binary method in all cases.
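The transformation described in the abstract can be illustrated with a minimal sketch. It assumes that "relative frequency of each modality" means the count of a value divided by the number of rows, computed per attribute; the function name and the toy rows are hypothetical and are not taken from the paper.

```python
from collections import Counter

def relative_frequency_encode(rows):
    """Replace each categorical value by its relative frequency
    within its own attribute (column)."""
    n = len(rows)
    # One Counter per attribute, built from the column-wise view of the rows.
    counts = [Counter(column) for column in zip(*rows)]
    return [
        [counts[j][value] / n for j, value in enumerate(row)]
        for row in rows
    ]

# Hypothetical toy dataset with two categorical attributes.
rows = [
    ("red", "small"),
    ("red", "small"),
    ("red", "large"),
    ("blue", "large"),
]
encoded = relative_frequency_encode(rows)
# encoded[0] is [0.75, 0.5]; encoded[3] is [0.25, 0.5]
```

The resulting numeric rows can then be passed to any k-means implementation. Note that in this toy dataset "small" and "large" occur equally often and therefore receive the same code, which is a property of this encoding that is worth keeping in mind.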

Keywords: Clustering, k-means, categorical datasets, pattern recognition, unsupervised learning, knowledge discovery.

Downloads: 3490

8 An Educational Data Mining System for Advising Higher Education Students

Authors: Heba Mohammed Nagy, Walid Mohamed Aly, Osama Fathy Hegazy

Abstract:

Educational data mining is a specific data mining field applied to data originating from educational environments; it relies on different approaches to discover hidden knowledge from the available data. Among these approaches are machine learning techniques, which are used to build a system that learns from previous data. Machine learning can be applied to solve different regression, classification, clustering and optimization problems.

In our research, we propose a “Student Advisory Framework” that utilizes classification and clustering to build an intelligent system. This system can be used to advise a first-year university student to pursue an education track in which he/she is likely to succeed, aiming to decrease the high rate of academic failure among these students. A real case study at the Cairo Higher Institute for Engineering, Computer Science and Management is presented using a real dataset collected from 2000 to 2012. The dataset has two main components: a pre-higher education dataset and a first-year course results dataset. Results have demonstrated the efficiency of the suggested framework.

Keywords: Classification, Clustering, Educational Data Mining (EDM), Machine Learning.

Downloads: 5165

7 Non-Overlapping Hierarchical Index Structure for Similarity Search

Authors: Mounira Taileb, Sid Lamrous, Sami Touati

Abstract:

In order to accelerate similarity search in high-dimensional databases, we propose a new hierarchical indexing method. It is composed of offline and online phases, and our contribution concerns both. In the offline phase, after gathering all of the data into clusters and constructing a hierarchical index, the main originality of our contribution consists in developing a method to construct bounding forms of clusters that avoid overlapping. For the online phase, our idea considerably improves the performance of similarity search, and we have also developed an adapted search algorithm for this phase. Our method, called NOHIS (Non-Overlapping Hierarchical Index Structure), uses Principal Direction Divisive Partitioning (PDDP) as its clustering algorithm. The principle of PDDP is to divide data recursively into two sub-clusters; the division is done using the hyperplane that is orthogonal to the principal direction derived from the covariance matrix and passes through the centroid of the cluster to be divided. The data of each of the two sub-clusters obtained are enclosed by a minimum bounding rectangle (MBR). The two MBRs are oriented according to the principal direction. Consequently, the two forms are guaranteed not to overlap. Experiments use databases containing image descriptors. Results show that the proposed method outperforms sequential scan and SR-tree in processing k-nearest neighbors.
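A single PDDP division step, as described above, might be sketched as follows. This is an illustration under stated assumptions rather than the NOHIS implementation: the principal direction is approximated by power iteration on the covariance matrix (started from an all-ones vector, which is assumed not to be orthogonal to the leading eigenvector), and the function names and sample points are hypothetical.

```python
import math

def principal_direction(points, iterations=100):
    """Return the centroid and an approximate leading eigenvector
    of the covariance matrix, found by power iteration."""
    n, dim = len(points), len(points[0])
    centroid = [sum(p[d] for p in points) / n for d in range(dim)]
    centered = [[p[d] - centroid[d] for d in range(dim)] for p in points]
    cov = [
        [sum(row[i] * row[j] for row in centered) / n for j in range(dim)]
        for i in range(dim)
    ]
    v = [1.0] * dim  # arbitrary starting vector (assumption)
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # degenerate case: all points identical
            break
        v = [x / norm for x in w]
    return centroid, v

def pddp_split(points):
    """Split points with the hyperplane through the centroid that is
    orthogonal to the principal direction."""
    centroid, direction = principal_direction(points)
    left, right = [], []
    for p in points:
        projection = sum(
            (p[d] - centroid[d]) * direction[d] for d in range(len(p))
        )
        (left if projection <= 0 else right).append(p)
    return left, right

# Hypothetical 2-D points spread mainly along the x axis.
points = [(-2.0, 0.0), (-1.0, 0.1), (1.0, -0.1), (2.0, 0.0)]
left, right = pddp_split(points)
```

Applying `pddp_split` recursively to each half would give the hierarchy of sub-clusters; building the oriented bounding rectangles is not shown here.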

Keywords: K-nearest neighbour search, multi-dimensional indexing, multimedia databases, similarity search.

Downloads: 1528

6 Controlling of Multi-Level Inverter under Shading Conditions Using Artificial Neural Network

Authors: Abed Sami Qawasme, Sameer Khader

Abstract:

This paper describes the effects of photovoltaic voltage changes on a multi-level inverter (MLI) due to solar irradiation variations, and methods to overcome these changes. The irradiation variation affects the generated voltage, which in turn varies the switching angles required to turn on the inverter power switches in order to obtain minimum harmonic content in the output voltage profile. A Genetic Algorithm (GA) is used to solve the harmonic elimination equations of eleven-level inverters with equal and non-equal dc sources. An artificial neural network (ANN) algorithm is then proposed to generate an appropriate set of switching angles for the MLI at any level of input dc source voltage, minimizing the total harmonic distortion (THD) to an acceptable limit. The MATLAB/Simulink platform is used as a simulation tool, and Fast Fourier Transform (FFT) analyses are carried out on the output voltage profile to verify the reliability and accuracy of the applied technique for controlling the MLI harmonic distortion. According to the simulation results, the obtained THD for equal dc sources is 9.38%, while for variable or unequal dc sources it varies between 10.26% and 12.93% as the input dc voltage varies between 4.47 V and 11.43 V, respectively. The proposed ANN algorithm provides satisfactory simulation results that match the results obtained by alternative algorithms.
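To make the relationship between switching angles and THD concrete, the sketch below evaluates the standard Fourier expression for a quarter-wave-symmetric staircase waveform, in which the amplitude of odd harmonic n is (4 / (n·π)) · Σ V_k · cos(n·θ_k). It is not the paper's GA or ANN procedure; the switching angles, the 1 V sources, and the truncation at the 49th harmonic are assumptions chosen only for illustration.

```python
import math

def harmonic_amplitude(order, angles, dc_voltages):
    """Amplitude of an odd harmonic of a quarter-wave-symmetric
    staircase waveform with one switching angle per dc source."""
    return (4.0 / (order * math.pi)) * sum(
        v * math.cos(order * theta)
        for v, theta in zip(dc_voltages, angles)
    )

def total_harmonic_distortion(angles, dc_voltages, max_order=49):
    """THD as the RMS-sum of odd harmonics 3..max_order over the fundamental."""
    fundamental = harmonic_amplitude(1, angles, dc_voltages)
    distortion = math.sqrt(sum(
        harmonic_amplitude(n, angles, dc_voltages) ** 2
        for n in range(3, max_order + 1, 2)
    ))
    return distortion / abs(fundamental)

# Hypothetical switching angles in radians: five steps per quarter
# cycle, which yields eleven voltage levels.
angles = [math.radians(a) for a in (5.0, 15.0, 25.0, 40.0, 60.0)]
thd_equal = total_harmonic_distortion(angles, [1.0] * 5)
# Unequal sources reuse the same function with different voltages.
thd_unequal = total_harmonic_distortion(angles, [1.0, 0.9, 1.1, 0.8, 1.0])
```

A search procedure such as a GA would vary `angles` to drive selected harmonics toward zero; for example, a single step at 30 degrees makes `harmonic_amplitude(3, ...)` vanish because cos(90°) = 0.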

Keywords: Multi-level inverter, genetic algorithm, artificial neural network, total harmonic distortion.

Downloads: 563

5 Non-Local Behavior of a Mixed-Mode Crack in a Functionally Graded Piezoelectric Medium

Authors: Nidhal Jamia, Sami El-Borgi

Abstract:

In this paper, the problem of a mixed-mode crack embedded in an infinite medium made of a functionally graded piezoelectric material (FGPM), with crack surfaces subjected to electro-mechanical loadings, is investigated. Eringen’s non-local theory of elasticity is adopted to formulate the governing electro-elastic equations. The properties of the piezoelectric material are assumed to vary exponentially along a plane perpendicular to the crack. Using the Fourier transform, three integral equations are obtained in which the unknown variables are the jumps of mechanical displacements and electric potentials across the crack surfaces. To solve the integral equations, the unknowns are directly expanded as a series of Jacobi polynomials, and the resulting equations are solved using the Schmidt method. In contrast to the classical solutions based on the local theory, it is found that no mechanical stress and electric displacement singularities are present at the crack tips when the non-local theory is employed to investigate the problem. A direct benefit is the ability to use the calculated maximum stress as a fracture criterion. The primary objective of this study is to investigate the effects of the crack length, the material gradient parameter describing FGPMs, and the lattice parameter on the mechanical stress and electric displacement field near the crack tips.

Keywords: Functionally graded piezoelectric material, mixed-mode crack, non-local theory, Schmidt method.

Downloads: 953

4 Effect of Varying Diets on Growth, Development and Survival of Queen Bee (Apis mellifera L.) in Captivity

Authors: Muhammad Anjum Aqueel, Zaighum Abbas, Mubasshir Sohail, Muhammad Abubakar, Hafiz Khurram Shurjeel, Abu Bakar Muhammad Raza, Muhammad Afzal, Sami Ullah

Abstract:

Keeping in view the increasing demand, queens of Apis mellifera L. (Hymenoptera: Apidae) were reared artificially in this experiment on varying diets including royal jelly. Larval duration, pupal duration, and the weight and size of pupae were evaluated on the different diets. Queen larvae were raised by the Doolittle grafting method. Four different diets were mixed with royal jelly and applied to larvae: fructose, sugar, yeast, and honey were provided to the reared queen larvae along with the same amount of royal jelly. Larval and pupal durations were longest (6.15 and 7.5 days, respectively) on yeast and shortest on honey (5.05 and 7.02 days, respectively). Heavier and bigger pupae were recorded on yeast (168.14 mg and 1.76 cm, respectively), followed by the diets containing sugar and honey. Because it produced heavier and bigger pupae, yeast was considered the best artificial diet for growing queen larvae. In the second part of the experiment, different amounts of yeast were therefore provided to growing larvae along with a fixed amount (0.5 g) of royal jelly. Survival rates of the larvae and queen bees were 70% and 40% with 4 g of food, 86.7% and 53.3% with 6 g of food, and 76.7% and 50% with 8 g of food. The weight of adult queen bees (1.459±0.191 g) and the number of ovarioles (41.7±21.3) were highest at 8 g of food. The results of this study can help beekeepers produce fitter queen bees.

Keywords: Apis mellifera L., dietary effect, survival and development, honey bee queen.

Downloads: 1282

3 A Technical Perspective on Roadway Safety in Eastern Province: Data Evaluation and Spatial Analysis

Authors: Muhammad Farhan, Sayed Faruque, Amr Mohammed, Sami Osman, Omar Al-Jabari, Abdul Almojil

Abstract:

Saudi Arabia has seen a drastic increase in traffic-related crashes in recent years. With a population of over 29 million, Saudi Arabia is considered a fast-growing, emerging economy. The rapid population increase and economic growth have resulted in rapid expansion of transportation infrastructure, which has led to an increase in road crashes. The Saudi Ministry of Interior reported more than 7,000 people killed and 68,000 injured in 2011, ranking Saudi Arabia among the worst countries worldwide in traffic safety. The traffic safety issues in the country also cause distress to road users and an economic loss exceeding 3.7 billion Euros annually. In view of this, researchers in Saudi Arabia are investigating ways to improve traffic safety conditions in the country. This paper presents a multilevel approach to collecting the traffic safety data required for traffic safety studies in the region. Two highway corridors connecting the cities of Dammam and Khobar, the 39-kilometre King Fahd Highway and the 42-kilometre Gulf Cooperation Council Highway, were selected as the study area. The traffic data collected included traffic counts, crash data, travel time data, and speed data. The collected data were analysed using a geographic information system to evaluate any correlation. Further research is needed to investigate the effectiveness of traffic safety data when collected in a concerted effort.

Keywords: Crash Data, Data Collection, Traffic Safety.

Downloads: 2308

2 Real-Time Data Stream Partitioning over a Sliding Window in Real-Time Spatial Big Data

Authors: Sana Hamdi, Emna Bouazizi, Sami Faiz

Abstract:

In recent years, real-time spatial applications, like location-aware services and traffic monitoring, have become more and more important. Such applications result in dynamic environments where data as well as queries are continuously moving. As a result, a tremendous amount of real-time spatial data is generated every day. The growth of the data volume seems to outpace the advance of our computing infrastructure. For instance, in real-time spatial Big Data, users expect to receive the results of each query within a short time period, regardless of the load on the system. But with a huge amount of real-time spatial data generated, the system performance degrades rapidly, especially in overload situations. To solve this problem, we propose the use of data partitioning as an optimization technique. Traditional horizontal and vertical partitioning can increase the performance of the system and simplify data management, but they remain insufficient for real-time spatial Big Data: they cannot deal with real-time and stream queries efficiently. Thus, in this paper, we propose a novel data partitioning approach for real-time spatial Big Data named VPA-RTSBD (Vertical Partitioning Approach for Real-Time Spatial Big Data). This contribution is an implementation of the Matching algorithm for traditional vertical partitioning. We first find the optimal attribute sequence using the Matching algorithm. Then, we propose a new cost model for database partitioning, designed to keep the amount of data in each partition more balanced and to provide parallel execution guarantees for the most frequent queries. VPA-RTSBD aims to obtain a real-time partitioning scheme and deals with stream data. It improves the performance of query execution by maximizing the degree of parallel execution. This improves QoS (Quality of Service) in real-time spatial Big Data, especially with a huge volume of stream data.
The performance of our contribution is evaluated via simulation experiments. The results show that the proposed algorithm is both efficient and scalable, and that it outperforms comparable algorithms.

Keywords: Real-Time Spatial Big Data, Quality Of Service, Vertical partitioning, Horizontal partitioning, Matching algorithm, Hamming distance, Stream query.

Downloads: 1009

1 Influence of Deficient Materials on the Reliability of Reinforced Concrete Members

Authors: Sami W. Tabsh

Abstract:

The strength of reinforced concrete depends on the member dimensions and material properties. The properties of concrete and steel materials are not constant but random variables. The variability of concrete strength is due to batching errors, variations in mixing, cement quality uncertainties, differences in the degree of compaction and disparity in curing. Similarly, the variability of steel strength is attributed to the manufacturing process, rolling conditions, characteristics of the base material, uncertainties in chemical composition, and the microstructure-property relationships. To account for such uncertainties, codes of practice for reinforced concrete design impose resistance factors to ensure structural reliability over the useful life of the structure. In this investigation, the effects on structural reliability of reductions in concrete and reinforcing steel strengths from the nominal values, beyond those accounted for in the structural design codes, are assessed. The considered limit states are flexure, shear and axial compression based on the ACI 318-11 structural concrete building code. Structural safety is measured in terms of a reliability index. Probabilistic resistance and load models are compiled from the available literature. The study showed that there is a wide variation in the reliability index for reinforced concrete members designed for flexure, shear or axial compression, especially when the live-to-dead load ratio is low. Furthermore, variations in concrete strength have a minor effect on the reliability of beams in flexure, a moderate effect on the reliability of beams in shear, and a severe effect on the reliability of columns in axial compression. On the other hand, changes in steel yield strength have a great effect on the reliability of beams in flexure, a moderate effect on the reliability of beams in shear, and a mild effect on the reliability of columns in axial compression.
Based on the outcome, it can be concluded that the reliability of beams is sensitive to changes in the yield strength of the steel reinforcement, whereas the reliability of columns is sensitive to variations in the concrete strength. Since the embedded target reliability in structural design codes results in lower structural safety in beams than in columns, large reductions in material strengths compromise the structural safety of beams much more than they affect columns.
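As a rough illustration of how a reliability index responds to a strength reduction, the sketch below uses the simplest first-order form, β = (μ_R − μ_Q) / √(σ_R² + σ_Q²), which holds for independent, normally distributed resistance R and load effect Q. This is an assumption made for illustration only: the abstract states that the probabilistic models were compiled from the literature and does not give their form, and all numerical values below are hypothetical.

```python
import math

def reliability_index(mean_r, std_r, mean_q, std_q):
    """First-order reliability index for independent normal
    resistance R and load effect Q."""
    return (mean_r - mean_q) / math.sqrt(std_r ** 2 + std_q ** 2)

def failure_probability(beta):
    """Probability of failure Phi(-beta) for a standard normal margin."""
    return 0.5 * math.erfc(beta / math.sqrt(2.0))

# Hypothetical member: nominal resistance versus a 20% strength
# reduction at the same coefficient of variation (10%).
beta_nominal = reliability_index(150.0, 15.0, 100.0, 10.0)
beta_deficient = reliability_index(120.0, 12.0, 100.0, 10.0)
pf_nominal = failure_probability(beta_nominal)
pf_deficient = failure_probability(beta_deficient)
```

With these assumed numbers the index drops from about 2.77 to about 1.28, which shows how a deficiency in material strength translates into a lower reliability index and a higher failure probability.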

Keywords: Code, flexure, limit states, random variables, reinforced concrete, reliability, reliability index, shear, structural safety.

Downloads: 2535