Search results for: k-nearest neighbors
35 Performance Analysis of Flooding Attack Prevention Algorithm in MANETs
Authors: Revathi Venkataraman, M. Pushpalatha, T. Rama Rao
Abstract:
The lack of any centralized infrastructure in mobile ad hoc networks (MANET) is one of the greatest security concerns in the deployment of wireless networks. Thus communication in MANET functions properly only if the participating nodes cooperate in routing without any malicious intention. However, some of the nodes may be malicious in their behavior, by indulging in flooding attacks on their neighbors. Some others may act malicious by launching active security attacks like denial of service. This paper addresses few related works done on trust evaluation and establishment in ad hoc networks. Related works on flooding attack prevention are reviewed. A new trust approach based on the extent of friendship between the nodes is proposed which makes the nodes to co-operate and prevent flooding attacks in an ad hoc environment. The performance of the trust algorithm is tested in an ad hoc network implementing the Ad hoc On-demand Distance Vector (AODV) protocol.Keywords: AODV, Flooding, MANETs, trust estimation
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 238534 Machine Learning for Aiding Meningitis Diagnosis in Pediatric Patients
Authors: Karina Zaccari, Ernesto Cordeiro Marujo
Abstract:
This paper presents a Machine Learning (ML) approach to support Meningitis diagnosis in patients at a children’s hospital in Sao Paulo, Brazil. The aim is to use ML techniques to reduce the use of invasive procedures, such as cerebrospinal fluid (CSF) collection, as much as possible. In this study, we focus on predicting the probability of Meningitis given the results of a blood and urine laboratory tests, together with the analysis of pain or other complaints from the patient. We tested a number of different ML algorithms, including: Adaptative Boosting (AdaBoost), Decision Tree, Gradient Boosting, K-Nearest Neighbors (KNN), Logistic Regression, Random Forest and Support Vector Machines (SVM). Decision Tree algorithm performed best, with 94.56% and 96.18% accuracy for training and testing data, respectively. These results represent a significant aid to doctors in diagnosing Meningitis as early as possible and in preventing expensive and painful procedures on some children.
Keywords: Machine learning, medical diagnosis, meningitis detection, gradient boosting.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 111033 Intelligent Swarm-Finding in Formation Control of Multi-Robots to Track a Moving Target
Authors: Anh Duc Dang, Joachim Horn
Abstract:
This paper presents a new approach to control robots, which can quickly find their swarm while tracking a moving target through the obstacles of the environment. In this approach, an artificial potential field is generated between each free-robot and the virtual attractive point of the swarm. This artificial potential field will lead free-robots to their swarm. The swarm-finding of these free-robots dose not influence the general motion of their swarm and nor other robots. When one singular robot approaches the swarm then its swarm-search will finish, and it will further participate with its swarm to reach the position of the target. The connections between member-robots with their neighbors are controlled by the artificial attractive/repulsive force field between them to avoid collisions and keep the constant distances between them in ordered formation. The effectiveness of the proposed approach has been verified in simulations.
Keywords: Formation control, potential field method, obstacle avoidance, swarm intelligence, multi-agent systems.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 218732 Image Mapping with Cumulative Distribution Function for Quick Convergence of Counter Propagation Neural Networks in Image Compression
Authors: S. Anna Durai, E. Anna Saro
Abstract:
In general the images used for compression are of different types like dark image, high intensity image etc. When these images are compressed using Counter Propagation Neural Network, it takes longer time to converge. The reason for this is that the given image may contain a number of distinct gray levels with narrow difference with their neighborhood pixels. If the gray levels of the pixels in an image and their neighbors are mapped in such a way that the difference in the gray levels of the neighbor with the pixel is minimum, then compression ratio as well as the convergence of the network can be improved. To achieve this, a Cumulative Distribution Function is estimated for the image and it is used to map the image pixels. When the mapped image pixels are used the Counter Propagation Neural Network yield high compression ratio as well as it converges quickly.Keywords: Correlation, Counter Propagation Neural Networks, Cummulative Distribution Function, Image compression.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 166931 Consensus of Multi-Agent Systems under the Special Consensus Protocols
Authors: Konghe Xie
Abstract:
Two consensus problems are considered in this paper. One is the consensus of linear multi-agent systems with weakly connected directed communication topology. The other is the consensus of nonlinear multi-agent systems with strongly connected directed communication topology. For the first problem, a simplified consensus protocol is designed: Each child agent can only communicate with one of its neighbors. That is, the real communication topology is a directed spanning tree of the original communication topology and without any cycles. Then, the necessary and sufficient condition is put forward to the multi-agent systems can be reached consensus. It is worth noting that the given conditions do not need any eigenvalue of the corresponding Laplacian matrix of the original directed communication network. For the second problem, the feedback gain is designed in the nonlinear consensus protocol. Then, the sufficient condition is proposed such that the systems can be achieved consensus. Besides, the consensus interval is introduced and analyzed to solve the consensus problem. Finally, two numerical simulations are included to verify the theoretical analysis.Keywords: Consensus, multi-agent systems, directed spanning tree, the Laplacian matrix.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 91430 Spreading Dynamics of a Viral Infection in a Complex Network
Authors: Khemanand Moheeput, Smita S. D. Goorah, Satish K. Ramchurn
Abstract:
We report a computational study of the spreading dynamics of a viral infection in a complex (scale-free) network. The final epidemic size distribution (FESD) was found to be unimodal or bimodal depending on the value of the basic reproductive number R0 . The FESDs occurred on time-scales long enough for intermediate-time epidemic size distributions (IESDs) to be important for control measures. The usefulness of R0 for deciding on the timeliness and intensity of control measures was found to be limited by the multimodal nature of the IESDs and by its inability to inform on the speed at which the infection spreads through the population. A reduction of the transmission probability at the hubs of the scale-free network decreased the occurrence of the larger-sized epidemic events of the multimodal distributions. For effective epidemic control, an early reduction in transmission at the index cell and its neighbors was essential.
Keywords: Basic reproductive number, epidemic control, scalefree network, viral infection.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 172029 Automatic Lip Contour Tracking and Visual Character Recognition for Computerized Lip Reading
Authors: Harshit Mehrotra, Gaurav Agrawal, M.C. Srivastava
Abstract:
Computerized lip reading has been one of the most actively researched areas of computer vision in recent past because of its crime fighting potential and invariance to acoustic environment. However, several factors like fast speech, bad pronunciation, poor illumination, movement of face, moustaches and beards make lip reading difficult. In present work, we propose a solution for automatic lip contour tracking and recognizing letters of English language spoken by speakers using the information available from lip movements. Level set method is used for tracking lip contour using a contour velocity model and a feature vector of lip movements is then obtained. Character recognition is performed using modified k nearest neighbor algorithm which assigns more weight to nearer neighbors. The proposed system has been found to have accuracy of 73.3% for character recognition with speaker lip movements as the only input and without using any speech recognition system in parallel. The approach used in this work is found to significantly solve the purpose of lip reading when size of database is small.Keywords: Contour Velocity Model, Lip Contour Tracking, LipReading, Visual Character Recognition.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 240028 An Improved Greedy Routing Algorithm for Grid using Pheromone-Based Landmarks
Authors: Lada-On Lertsuwanakul, Herwig Unger
Abstract:
This paper objects to extend Jon Kleinberg-s research. He introduced the structure of small-world in a grid and shows with a greedy algorithm using only local information able to find route between source and target in delivery time O(log2n). His fundamental model for distributed system uses a two-dimensional grid with longrange random links added between any two node u and v with a probability proportional to distance d(u,v)-2. We propose with an additional information of the long link nearby, we can find the shorter path. We apply the ant colony system as a messenger distributed their pheromone, the long-link details, in surrounding area. The subsequence forwarding decision has more option to move to, select among local neighbors or send to node has long link closer to its target. Our experiment results sustain our approach, the average routing time by Color Pheromone faster than greedy method.
Keywords: Routing algorithm, Small-World network, Ant Colony Optimization, and Peer-to-peer System.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 185927 A Survey on Opportunistic Routing in Mobile Ad Hoc Networks
Authors: R. Poonkuzhali, M. Y. Sanavullah, A. Sabari, T. Dhivyaa
Abstract:
Opportunistic Routing (OR) increases the transmission reliability and network throughput. Traditional routing protocols preselects one or more predetermined nodes before transmission starts and uses a predetermined neighbor to forward a packet in each hop. The opportunistic routing overcomes the drawback of unreliable wireless transmission by broadcasting one transmission can be overheard by manifold neighbors. The first cooperation-optimal protocol for Multirate OR (COMO) used to achieve social efficiency and prevent the selfish behavior of the nodes. The novel link-correlation-aware OR improves the performance by exploiting the miscellaneous low correlated forward links. Context aware Adaptive OR (CAOR) uses active suppression mechanism to reduce packet duplication. The Context-aware OR (COR) can provide efficient routing in mobile networks. By using Cooperative Opportunistic Routing in Mobile Ad hoc Networks (CORMAN), the problem of opportunistic data transfer can be tackled. While comparing to all the protocols, COMO is the best as it achieves social efficiency and prevents the selfish behavior of the nodes.Keywords: CAOR, COMO, COR, CORMAN, MANET, Opportunistic Routing, Reliability, Throughput.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 188026 A Study about the Distribution of the Spanning Ratios of Yao Graphs
Authors: Maryam Hsaini, Mostafa Nouri-Baygi
Abstract:
A critical problem in wireless sensor networks is limited battery and memory of nodes. Therefore, each node in the network could maintain only a subset of its neighbors to communicate with. This will increase the battery usage in the network because each packet should take more hops to reach its destination. In order to tackle these problems, spanner graphs are defined. Since each node has a small degree in a spanner graph and the distance in the graph is not much greater than its actual geographical distance, spanner graphs are suitable candidates to be used for the topology of a wireless sensor network. In this paper, we study Yao graphs and their behavior for a randomly selected set of points. We generate several random point sets and compare the properties of their Yao graphs with the complete graph. Based on our data sets, we obtain several charts demonstrating how Yao graphs behave for a set of randomly chosen point set. As the results show, the stretch factor of a Yao graph follows a normal distribution. Furthermore, the stretch factor is in average far less than the worst case stretch factor proved for Yao graphs in previous results. Furthermore, we use Yao graph for a realistic point set and study its stretch factor in real world.
Keywords: Wireless sensor network, spanner graph, Yao Graph.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 59725 Enhance Image Transmission Based on DWT with Pixel Interleaver
Authors: Muhanned Alfarras
Abstract:
The recent growth of using multimedia transmission over wireless communication systems, have challenges to protect the data from lost due to wireless channel effect. Images are corrupted due to the noise and fading when transmitted over wireless channel, in wireless channel the image is transmitted block by block, Due to severe fading, entire image blocks can be damaged. The aim of this paper comes out from need to enhance the digital images at the wireless receiver side. Proposed Boundary Interpolation (BI) Algorithm using wavelet, have been adapted here used to reconstruction the lost block in the image at the receiver depend on the correlation between the lost block and its neighbors. New Proposed technique by using Boundary Interpolation (BI) Algorithm using wavelet with Pixel interleaver has been implemented. Pixel interleaver work on distribute the pixel to new pixel position of original image before transmitting the image. The block lost through wireless channel is only effects individual pixel. The lost pixels at the receiver side can be recovered by using Boundary Interpolation (BI) Algorithm using wavelet. The results showed that the New proposed algorithm boundary interpolation (BI) using wavelet with pixel interleaver is better in term of MSE and PSNR.Keywords: Image Transmission, Wavelet, Pixel Interleaver, Boundary Interpolation Algorithm
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 159424 Comparison of Different k-NN Models for Speed Prediction in an Urban Traffic Network
Authors: Seyoung Kim, Jeongmin Kim, Kwang Ryel Ryu
Abstract:
A database that records average traffic speeds measured at five-minute intervals for all the links in the traffic network of a metropolitan city. While learning from this data the models that can predict future traffic speed would be beneficial for the applications such as the car navigation system, building predictive models for every link becomes a nontrivial job if the number of links in a given network is huge. An advantage of adopting k-nearest neighbor (k-NN) as predictive models is that it does not require any explicit model building. Instead, k-NN takes a long time to make a prediction because it needs to search for the k-nearest neighbors in the database at prediction time. In this paper, we investigate how much we can speed up k-NN in making traffic speed predictions by reducing the amount of data to be searched for without a significant sacrifice of prediction accuracy. The rationale behind this is that we had a better look at only the recent data because the traffic patterns not only repeat daily or weekly but also change over time. In our experiments, we build several different k-NN models employing different sets of features which are the current and past traffic speeds of the target link and the neighbor links in its up/down-stream. The performances of these models are compared by measuring the average prediction accuracy and the average time taken to make a prediction using various amounts of data.Keywords: Big data, k-NN, machine learning, traffic speed prediction.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 137623 Comparison of Machine Learning Techniques for Single Imputation on Audiograms
Authors: Sarah Beaver, Renee Bryce
Abstract:
Audiograms detect hearing impairment, but missing values pose problems. This work explores imputations in an attempt to improve accuracy. This work implements Linear Regression, Lasso, Linear Support Vector Regression, Bayesian Ridge, K Nearest Neighbors (KNN), and Random Forest machine learning techniques to impute audiogram frequencies ranging from 125 Hz to 8000 Hz. The data contain patients who had or were candidates for cochlear implants. Accuracy is compared across two different Nested Cross-Validation k values. Over 4000 audiograms were used from 800 unique patients. Additionally, training on data combines and compares left and right ear audiograms versus single ear side audiograms. The accuracy achieved using Root Mean Square Error (RMSE) values for the best models for Random Forest ranges from 4.74 to 6.37. The R2 values for the best models for Random Forest ranges from .91 to .96. The accuracy achieved using RMSE values for the best models for KNN ranges from 5.00 to 7.72. The R2 values for the best models for KNN ranges from .89 to .95. The best imputation models received R2 between .89 to .96 and RMSE values less than 8dB. We also show that the accuracy of classification predictive models performed better with our imputation models versus constant imputations by a two percent increase.
Keywords: Machine Learning, audiograms, data imputations, single imputations.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15822 Wormhole Attack Detection in Wireless Sensor Networks
Authors: Zaw Tun, Aung Htein Maw
Abstract:
The nature of wireless ad hoc and sensor networks make them very attractive to attackers. One of the most popular and serious attacks in wireless ad hoc networks is wormhole attack and most proposed protocols to defend against this attack used positioning devices, synchronized clocks, or directional antennas. This paper analyzes the nature of wormhole attack and existing methods of defending mechanism and then proposes round trip time (RTT) and neighbor numbers based wormhole detection mechanism. The consideration of proposed mechanism is the RTT between two successive nodes and those nodes- neighbor number which is needed to compare those values of other successive nodes. The identification of wormhole attacks is based on the two faces. The first consideration is that the transmission time between two wormhole attack affected nodes is considerable higher than that between two normal neighbor nodes. The second detection mechanism is based on the fact that by introducing new links into the network, the adversary increases the number of neighbors of the nodes within its radius. This system does not require any specific hardware, has good performance and little overhead and also does not consume extra energy. The proposed system is designed in ad hoc on-demand distance vector (AODV) routing protocol and analysis and simulations of the proposed system are performed in network simulator (ns-2).Keywords: AODV, Wormhole attacks, Wireless ad hoc andsensor networks
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 346821 Delay Preserving Substructures in Wireless Networks Using Edge Difference between a Graph and its Square Graph
Authors: T. N. Janakiraman, J. Janet Lourds Rani
Abstract:
In practice, wireless networks has the property that the signal strength attenuates with respect to the distance from the base station, it could be better if the nodes at two hop away are considered for better quality of service. In this paper, we propose a procedure to identify delay preserving substructures for a given wireless ad-hoc network using a new graph operation G 2 – E (G) = G* (Edge difference of square graph of a given graph and the original graph). This operation helps to analyze some induced substructures, which preserve delay in communication among them. This operation G* on a given graph will induce a graph, in which 1- hop neighbors of any node are at 2-hop distance in the original network. In this paper, we also identify some delay preserving substructures in G*, which are (i) set of all nodes, which are mutually at 2-hop distance in G that will form a clique in G*, (ii) set of nodes which forms an odd cycle C2k+1 in G, will form an odd cycle in G* and the set of nodes which form a even cycle C2k in G that will form two disjoint companion cycles ( of same parity odd/even) of length k in G*, (iii) every path of length 2k+1 or 2k in G will induce two disjoint paths of length k in G*, and (iv) set of nodes in G*, which induces a maximal connected sub graph with radius 1 (which identifies a substructure with radius equal 2 and diameter at most 4 in G). The above delay preserving sub structures will behave as good clusters in the original network.Keywords: Clique, cycles, delay preserving substructures, maximal connected sub graph.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 125320 An Application for Risk of Crime Prediction Using Machine Learning
Authors: Luis Fonseca, Filipe Cabral Pinto, Susana Sargento
Abstract:
The increase of the world population, especially in large urban centers, has resulted in new challenges particularly with the control and optimization of public safety. Thus, in the present work, a solution is proposed for the prediction of criminal occurrences in a city based on historical data of incidents and demographic information. The entire research and implementation will be presented start with the data collection from its original source, the treatment and transformations applied to them, choice and the evaluation and implementation of the Machine Learning model up to the application layer. Classification models will be implemented to predict criminal risk for a given time interval and location. Machine Learning algorithms such as Random Forest, Neural Networks, K-Nearest Neighbors and Logistic Regression will be used to predict occurrences, and their performance will be compared according to the data processing and transformation used. The results show that the use of Machine Learning techniques helps to anticipate criminal occurrences, which contributed to the reinforcement of public security. Finally, the models were implemented on a platform that will provide an API to enable other entities to make requests for predictions in real-time. An application will also be presented where it is possible to show criminal predictions visually.Keywords: Crime prediction, machine learning, public safety, smart city.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 132419 Emotion Classification for Students with Autism in Mathematics E-learning using Physiological and Facial Expression Measures
Authors: Hui-Chuan Chu, Min-Ju Liao, Wei-Kai Cheng, William Wei-Jen Tsai, Yuh-Min Chen
Abstract:
Avoiding learning failures in mathematics e-learning environments caused by emotional problems in students with autism has become an important topic for combining of special education with information and communications technology. This study presents an adaptive emotional adjustment model in mathematics e-learning for students with autism, emphasizing the lack of emotional perception in mathematics e-learning systems. In addition, an emotion classification for students with autism was developed by inducing emotions in mathematical learning environments to record changes in the physiological signals and facial expressions of students. Using these methods, 58 emotional features were obtained. These features were then processed using one-way ANOVA and information gain (IG). After reducing the feature dimension, methods of support vector machines (SVM), k-nearest neighbors (KNN), and classification and regression trees (CART) were used to classify four emotional categories: baseline, happy, angry, and anxious. After testing and comparisons, in a situation without feature selection, the accuracy rate of the SVM classification can reach as high as 79.3-%. After using IG to reduce the feature dimension, with only 28 features remaining, SVM still has a classification accuracy of 78.2-%. The results of this research could enhance the effectiveness of eLearning in special education.
Keywords: Emotion classification, Physiological and facial Expression measures, Students with autism, Mathematics e-learning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 178118 Predictive Analytics of Student Performance Determinants in Education
Authors: Mahtab Davari, Charles Edward Okon, Somayeh Aghanavesi
Abstract:
Every institute of learning is usually interested in the performance of enrolled students. The level of these performances determines the approach an institute of study may adopt in rendering academic services. The focus of this paper is to evaluate students' academic performance in given courses of study using machine learning methods. This study evaluated various supervised machine learning classification algorithms such as Logistic Regression (LR), Support Vector Machine (SVM), Random Forest, Decision Tree, K-Nearest Neighbors, Linear Discriminant Analysis (LDA), and Quadratic Discriminant Analysis, using selected features to predict study performance. The accuracy, precision, recall, and F1 score obtained from a 5-Fold Cross-Validation were used to determine the best classification algorithm to predict students’ performances. SVM (using a linear kernel), LDA, and LR were identified as the best-performing machine learning methods. Also, using the LR model, this study identified students' educational habits such as reading and paying attention in class as strong determinants for a student to have an above-average performance. Other important features include the academic history of the student and work. Demographic factors such as age, gender, high school graduation, etc., had no significant effect on a student's performance.
Keywords: Student performance, supervised machine learning, prediction, classification, cross-validation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 54817 Active Segment Selection Method in EEG Classification Using Fractal Features
Authors: Samira Vafaye Eslahi
Abstract:
BCI (Brain Computer Interface) is a communication machine that translates brain massages to computer commands. These machines with the help of computer programs can recognize the tasks that are imagined. Feature extraction is an important stage of the process in EEG classification that can effect in accuracy and the computation time of processing the signals. In this study we process the signal in three steps of active segment selection, fractal feature extraction, and classification. One of the great challenges in BCI applications is to improve classification accuracy and computation time together. In this paper, we have used student’s 2D sample t-statistics on continuous wavelet transforms for active segment selection to reduce the computation time. In the next level, the features are extracted from some famous fractal dimension estimation of the signal. These fractal features are Katz and Higuchi. In the classification stage we used ANFIS (Adaptive Neuro-Fuzzy Inference System) classifier, FKNN (Fuzzy K-Nearest Neighbors), LDA (Linear Discriminate Analysis), and SVM (Support Vector Machines). We resulted that active segment selection method would reduce the computation time and Fractal dimension features with ANFIS analysis on selected active segments is the best among investigated methods in EEG classification.
Keywords: EEG, Student’s t- statistics, BCI, Fractal Features, ANFIS, FKNN.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 212016 A New Fast Skin Color Detection Technique
Authors: Tarek M. Mahmoud
Abstract:
Skin color can provide a useful and robust cue for human-related image analysis, such as face detection, pornographic image filtering, hand detection and tracking, people retrieval in databases and Internet, etc. The major problem of such kinds of skin color detection algorithms is that it is time consuming and hence cannot be applied to a real time system. To overcome this problem, we introduce a new fast technique for skin detection which can be applied in a real time system. In this technique, instead of testing each image pixel to label it as skin or non-skin (as in classic techniques), we skip a set of pixels. The reason of the skipping process is the high probability that neighbors of the skin color pixels are also skin pixels, especially in adult images and vise versa. The proposed method can rapidly detect skin and non-skin color pixels, which in turn dramatically reduce the CPU time required for the protection process. Since many fast detection techniques are based on image resizing, we apply our proposed pixel skipping technique with image resizing to obtain better results. The performance evaluation of the proposed skipping and hybrid techniques in terms of the measured CPU time is presented. Experimental results demonstrate that the proposed methods achieve better result than the relevant classic method.Keywords: Adult images filtering, image resizing, skin color detection, YcbCr color space.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 400315 Non-Overlapping Hierarchical Index Structure for Similarity Search
Authors: Mounira Taileb, Sid Lamrous, Sami Touati
Abstract:
In order to accelerate the similarity search in highdimensional database, we propose a new hierarchical indexing method. It is composed of offline and online phases. Our contribution concerns both phases. In the offline phase, after gathering the whole of the data in clusters and constructing a hierarchical index, the main originality of our contribution consists to develop a method to construct bounding forms of clusters to avoid overlapping. For the online phase, our idea improves considerably performances of similarity search. However, for this second phase, we have also developed an adapted search algorithm. Our method baptized NOHIS (Non-Overlapping Hierarchical Index Structure) use the Principal Direction Divisive Partitioning (PDDP) as algorithm of clustering. The principle of the PDDP is to divide data recursively into two sub-clusters; division is done by using the hyper-plane orthogonal to the principal direction derived from the covariance matrix and passing through the centroid of the cluster to divide. Data of each two sub-clusters obtained are including by a minimum bounding rectangle (MBR). The two MBRs are directed according to the principal direction. Consequently, the nonoverlapping between the two forms is assured. Experiments use databases containing image descriptors. Results show that the proposed method outperforms sequential scan and SRtree in processing k-nearest neighbors.
Keywords: K-nearest neighbour search, multi-dimensional indexing, multimedia databases, similarity search.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 156114 Distances over Incomplete Diabetes and Breast Cancer Data Based on Bhattacharyya Distance
Authors: Loai AbdAllah, Mahmoud Kaiyal
Abstract:
Missing values in real-world datasets are a common problem. Many algorithms were developed to deal with this problem, most of them replace the missing values with a fixed value that was computed based on the observed values. In our work, we used a distance function based on Bhattacharyya distance to measure the distance between objects with missing values. Bhattacharyya distance, which measures the similarity of two probability distributions. The proposed distance distinguishes between known and unknown values. Where the distance between two known values is the Mahalanobis distance. When, on the other hand, one of them is missing the distance is computed based on the distribution of the known values, for the coordinate that contains the missing value. This method was integrated with Wikaya, a digital health company developing a platform that helps to improve prevention of chronic diseases such as diabetes and cancer. In order for Wikaya’s recommendation system to work distance between users need to be measured. Since there are missing values in the collected data, there is a need to develop a distance function distances between incomplete users profiles. To evaluate the accuracy of the proposed distance function in reflecting the actual similarity between different objects, when some of them contain missing values, we integrated it within the framework of k nearest neighbors (kNN) classifier, since its computation is based only on the similarity between objects. To validate this, we ran the algorithm over diabetes and breast cancer datasets, standard benchmark datasets from the UCI repository. Our experiments show that kNN classifier using our proposed distance function outperforms the kNN using other existing methods.Keywords: Missing values, distance metric, Bhattacharyya distance.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 78113 Relevance Feedback within CBIR Systems
Authors: Mawloud Mosbah, Bachir Boucheham
Abstract:
We present here the results for a comparative study of some techniques, available in the literature, related to the relevance feedback mechanism in the case of a short-term learning. Only one method among those considered here is belonging to the data mining field which is the K-nearest neighbors algorithm (KNN) while the rest of the methods is related purely to the information retrieval field and they fall under the purview of the following three major axes: Shifting query, Feature Weighting and the optimization of the parameters of similarity metric. As a contribution, and in addition to the comparative purpose, we propose a new version of the KNN algorithm referred to as an incremental KNN which is distinct from the original version in the sense that besides the influence of the seeds, the rate of the actual target image is influenced also by the images already rated. The results presented here have been obtained after experiments conducted on the Wang database for one iteration and utilizing color moments on the RGB space. This compact descriptor, Color Moments, is adequate for the efficiency purposes needed in the case of interactive systems. The results obtained allow us to claim that the proposed algorithm proves good results; it even outperforms a wide range of techniques available in the literature.
Keywords: CBIR, Category Search, Relevance Feedback (RFB), Query Point Movement, Standard Rocchio’s Formula, Adaptive Shifting Query, Feature Weighting, Optimization of the Parameters of Similarity Metric, Original KNN, Incremental KNN.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 234212 Family Communication Patterns between Muslim and Santal Communities in Rural Bangladesh: A Cross-Cultural Perspective
Authors: Md. Emaj Uddin
Abstract:
This study compares family communication patterns in association with family socio-cultural status, especially marriage and family pattern, and couples- socio-economic status between Muslim and Santal communities in rural Bangladesh. A total of 288 couples, 145 couples from the Muslim and 143 couples from the Santal were randomly selected through cluster sampling procedure from Kalna village situated in Tanore Upazila of Rajshahi district of Bangladesh, where both the communities dwell as neighbors. In order to collect data from the selected samples, interview method with semistructural questionnaire schedule was applied. The responses given by the respondents were analyzed by Pearson-s chi-squire test and bivariate correlation techniques. The results of Pearson-s chi-squire test revealed that family communication patterns (X2= 25. 90, df= 2, p<0.01, p>0.05) were significantly different between the Muslim and Santal communities. In addition, Spearman-s bivariate correlation coefficients suggested that among the exogenous factors, family type (rs=.135, p<0.05) and occupation of both husband (rs= .197, p<0.01) and wife (rs= .265, p<0.01) were significantly positive associations, and marital arrangement (rs= -.177, p<0.01), education of husband (rs= -.108, p<0.05) and wife (rs= -.142, p<0.01 & p<0.05), and family income (rs= -.164, p<0.01) were significantly negative relations with the family communication patterns followed between the two communities, although age difference between husband and wife, family head and residence patterns were not significant relations with ones.
Keywords: Bangladesh, Cross-Cultural Comparison, Family Communication Patterns, Family Socio-Cultural Status, Muslim, Santal.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 257511 Classification of Potential Biomarkers in Breast Cancer Using Artificial Intelligence Algorithms and Anthropometric Datasets
Authors: Aref Aasi, Sahar Ebrahimi Bajgani, Erfan Aasi
Abstract:
Breast cancer (BC) continues to be the most frequent cancer in females and causes the highest number of cancer-related deaths in women worldwide. Inspired by recent advances in studying the relationship between different patient attributes and features and the disease, in this paper, we have tried to investigate the different classification methods for better diagnosis of BC in the early stages. In this regard, datasets from the University Hospital Centre of Coimbra were chosen, and different machine learning (ML)-based and neural network (NN) classifiers have been studied. For this purpose, we have selected favorable features among the nine provided attributes from the clinical dataset by using a random forest algorithm. This dataset consists of both healthy controls and BC patients, and it was noted that glucose, BMI, resistin, and age have the most importance, respectively. Moreover, we have analyzed these features with various ML-based classifier methods, including Decision Tree (DT), K-Nearest Neighbors (KNN), eXtreme Gradient Boosting (XGBoost), Logistic Regression (LR), Naive Bayes (NB), and Support Vector Machine (SVM) along with NN-based Multi-Layer Perceptron (MLP) classifier. The results revealed that among different techniques, the SVM and MLP classifiers have the most accuracy, with amounts of 96% and 92%, respectively. These results divulged that the adopted procedure could be used effectively for the classification of cancer cells, and also it encourages further experimental investigations with more collected data for other types of cancers.
Keywords: Breast cancer, health diagnosis, Machine Learning, biomarker classification, Neural Network.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 31910 Visualization and Indexing of Spectral Databases
Authors: Tibor Kulcsar, Gabor Sarossy, Gabor Bereznai, Robert Auer, Janos Abonyi
Abstract:
On-line (near infrared) spectroscopy is widely used to support the operation of complex process systems. Information extracted from spectral database can be used to estimate unmeasured product properties and monitor the operation of the process. These techniques are based on looking for similar spectra by nearest neighborhood algorithms and distance based searching methods. Search for nearest neighbors in the spectral space is an NP-hard problem, the computational complexity increases by the number of points in the discrete spectrum and the number of samples in the database. To reduce the calculation time some kind of indexing could be used. The main idea presented in this paper is to combine indexing and visualization techniques to reduce the computational requirement of estimation algorithms by providing a two dimensional indexing that can also be used to visualize the structure of the spectral database. This 2D visualization of spectral database does not only support application of distance and similarity based techniques but enables the utilization of advanced clustering and prediction algorithms based on the Delaunay tessellation of the mapped spectral space. This means the prediction has not to use the high dimension space but can be based on the mapped space too. The results illustrate that the proposed method is able to segment (cluster) spectral databases and detect outliers that are not suitable for instance based learning algorithms.
Keywords: indexing high dimensional databases, dimensional reduction, clustering, similarity, k-nn algorithm.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17699 Factors Associated with Mammography Screening Behaviors: A Cross-Sectional Descriptive Study of Egyptian Women
Authors: Salwa Hagag Abdelaziz, Naglaa Fathy Youssef, Nadia Abdel Latif Hassan, Rasha Wesam Abdel Rahman
Abstract:
Breast cancer is considered as a substantial health concern and practicing mammography screening [MS] is important in minimizing its related morbidity. So it is essential to have a better understanding of breast cancer screening behaviors of women and factors that influence utilization of them. The aim of this study is to identify the factors that are linked to MS behaviors among the Egyptian women. A cross-sectional descriptive design was carried out to provide a snapshot of the factors that are linked to MS behaviors. A convenience sample of 311 women was utilized and all eligible participants admitted to the Women Imaging Unit who are 40 years of age or above, coming for mammography assessment, not pregnant or breast feeding and who accepted to participate in the study were included. A structured questionnaire was developed by the researchers and contains three parts; Socio-demographic data; Motivating factors associated with MS; and association between MS and model of behavior change. The analyzed data indicated that most of the participated women (66.6%) belonged to the age group of 40- 49.A high proportion of participants (58.1%) of group having previous MS influenced by their neighbors to practice MS, whereas 32.7 % in group not having previous MS were influenced by family members which indicated significant differences (P <0.05). Doctors and media shown to be the least influence of others to practice MS. Women with intention to have a future mammogram had higher OR (1.404) for practicing MS compared with women with no intention. Further studies are needed to examine the relation between Transtheoretical Model [TTM] and practicing MS.
Keywords: Breast cancer, mammography, screening behaviors.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21488 Automatic Staging and Subtype Determination for Non-Small Cell Lung Carcinoma Using PET Image Texture Analysis
Authors: Seyhan Karaçavuş, Bülent Yılmaz, Ömer Kayaaltı, Semra İçer, Arzu Taşdemir, Oğuzhan Ayyıldız, Kübra Eset, Eser Kaya
Abstract:
In this study, our goal was to perform tumor staging and subtype determination automatically using different texture analysis approaches for a very common cancer type, i.e., non-small cell lung carcinoma (NSCLC). Especially, we introduced a texture analysis approach, called Law’s texture filter, to be used in this context for the first time. The 18F-FDG PET images of 42 patients with NSCLC were evaluated. The number of patients for each tumor stage, i.e., I-II, III or IV, was 14. The patients had ~45% adenocarcinoma (ADC) and ~55% squamous cell carcinoma (SqCCs). MATLAB technical computing language was employed in the extraction of 51 features by using first order statistics (FOS), gray-level co-occurrence matrix (GLCM), gray-level run-length matrix (GLRLM), and Laws’ texture filters. The feature selection method employed was the sequential forward selection (SFS). Selected textural features were used in the automatic classification by k-nearest neighbors (k-NN) and support vector machines (SVM). In the automatic classification of tumor stage, the accuracy was approximately 59.5% with k-NN classifier (k=3) and 69% with SVM (with one versus one paradigm), using 5 features. In the automatic classification of tumor subtype, the accuracy was around 92.7% with SVM one vs. one. Texture analysis of FDG-PET images might be used, in addition to metabolic parameters as an objective tool to assess tumor histopathological characteristics and in automatic classification of tumor stage and subtype.Keywords: Cancer stage, cancer cell type, non-small cell lung carcinoma, PET, texture analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 9767 Relevant Stakeholders in Environmental Management Organization: The Case of Industries Três Rios/RJ
Authors: Beatriz dos Anjos Furtado, Marina Barreiros Lamim, Camila Avozani Zago, Julianne Alvim Milward-de-Azevedo, Luís Cláudio Meirelles de Medeiros
Abstract:
The intense process of economic acceleration, expansion of industrial activities and capitalism, combined with population growth, while promoting the development, bring environmental consequences and dynamics of locations. It can be seen that society is seeking to break with old paradigms of capitalist society, seeking to reconcile growth with sustainable development, with a change of mentality of the stakeholders of the production process (shareholders, employees, suppliers, customers, governments, and neighbors, groups citizens and the public in general). In this context, this research aims to map the stakeholders interested in environmental management in industries located in the city of Três Rios/RJ. The city of Três Rios is located in South-Central region of the state of Rio de Janeiro - Brazil. Methodological resources used refer to descriptive and field research, whose nature is qualitative and quantitative. It is also of multicases studies in the study area, and the data collection occurred by means of semi-structured questionnaires and interviews with employees related to the environmental area of the industries located in Três Rios and registered at the Federation of Industries the State of Rio de Janeiro - FIRJAN in the version of 2013 and active in federal revenue. Through this research it observed, among other things, the stakeholders involved in the environmental management process of “Três Rios” industry respondents, and those responding to the demands of environmental management.
Keywords: Environmental management, environmental practices, industry, stakeholders.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14916 A Spatial Hypergraph Based Semi-Supervised Band Selection Method for Hyperspectral Imagery Semantic Interpretation
Authors: Akrem Sellami, Imed Riadh Farah
Abstract:
Hyperspectral imagery (HSI) typically provides a wealth of information captured in a wide range of the electromagnetic spectrum for each pixel in the image. Hence, a pixel in HSI is a high-dimensional vector of intensities with a large spectral range and a high spectral resolution. Therefore, the semantic interpretation is a challenging task of HSI analysis. We focused in this paper on object classification as HSI semantic interpretation. However, HSI classification still faces some issues, among which are the following: The spatial variability of spectral signatures, the high number of spectral bands, and the high cost of true sample labeling. Therefore, the high number of spectral bands and the low number of training samples pose the problem of the curse of dimensionality. In order to resolve this problem, we propose to introduce the process of dimensionality reduction trying to improve the classification of HSI. The presented approach is a semi-supervised band selection method based on spatial hypergraph embedding model to represent higher order relationships with different weights of the spatial neighbors corresponding to the centroid of pixel. This semi-supervised band selection has been developed to select useful bands for object classification. The presented approach is evaluated on AVIRIS and ROSIS HSIs and compared to other dimensionality reduction methods. The experimental results demonstrate the efficacy of our approach compared to many existing dimensionality reduction methods for HSI classification.Keywords: Hyperspectral image, spatial hypergraph, dimensionality reduction, semantic interpretation, band selection, feature extraction.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1220