Search results for: clustering ensemble
341 Patented Free-Space Optical System for Auto Aligned Optical Beam Allowing to Compensate Mechanical Misalignments
Authors: Aurelien Boutin
Abstract:
In optical systems such as Variable Optical Delay Lines, where a collimated beam has to go back and forth, corner cubes are used in order to keep the reflected beam parallel to the incoming beam. However, the reflected beam can be laterally shifted, which will lead to losses. In this paper, we report on a patented optical design that allows keeping the reflected beam with the exact same position and direction whatever the displacement of the corner cube leading to zero losses. After explaining how the optical design works and theoretically allows to compensate for any defects in the translation of the corner cube, we will present the results of experimental comparisons between a standard layout (i.e., only corner cubes) and our optical layout. To compare both optical layouts, we used a fiber-to-fiber coupling setup. It consists of a couple of lights from one fiber to the other, thanks to two lenses. The ensemble [fiber+lense] is fixed and called a collimator so that the light is coupled from one collimator to another. Each collimator was precisely made in order to have a precise working distance. In the experiment, we measured and compared the Insertion Losses (IL) variations between both collimators with the distance between them (i.e., natural Gaussian beam coupling losses) and between both collimators in the different optical layouts tested, with the same optical length propagation. We will show that the IL variations of our setup are less than 0.05dB with respect to the IL variations of collimators alone.Keywords: free-space optics, variable optical delay lines, optical cavity, auto-alignment
Procedia PDF Downloads 100340 Design of Personal Job Recommendation Framework on Smartphone Platform
Authors: Chayaporn Kaensar
Abstract:
Recently, Job Recommender Systems have gained much attention in industries since they solve the problem of information overload on the recruiting website. Therefore, we proposed Extended Personalized Job System that has the capability of providing the appropriate jobs for job seeker and recommending some suitable information for them using Data Mining Techniques and Dynamic User Profile. On the other hands, company can also interact to the system for publishing and updating job information. This system have emerged and supported various platforms such as web application and android mobile application. In this paper, User profiles, Implicit User Action, User Feedback, and Clustering Techniques in WEKA libraries have gained attention and implemented for this application. In additions, open source tools like Yii Web Application Framework, Bootstrap Front End Framework and Android Mobile Technology were also applied.Keywords: recommendation, user profile, data mining, web and mobile technology
Procedia PDF Downloads 313339 Music for Peace, a Model for Socialization
Authors: Mina Fenercioglu
Abstract:
This study discusses a Turkish music education model similar to El Sistema. The Music for Peace (Baris icin Muzik) program, founded in 2005 by an idealist humanitarian in Istanbul, started as a pilot project with accordion and then with flute in ensembles at the Ulubatlı Hasan Primary School where mostly underprivileged children attend. The program gives complimentary music lessons particularly to deprived children, who at the beginning were prone to crime. With music education, the attitudes of the children turn to a positive aspect. The aim of this initiative provides social and cultural awareness, which serves the same mission as the world known El Sistema. In 2009, the Music for Peace project received Deutsche Bank Urban Age Award, which is a prize presented to enterprises that improve the quality of life in urban environment. Since 2010, the Music for Peace continues the symphonic music education at its own place. In 2011, Music for Peace gained foundation status, and started to accept donations as musical instruments for children who attend the courses. On July 2013, IKSV (Istanbul Culture and Arts Foundation) became the institutional partner of Music for Peace Foundation and in June 2014, the foundation signed up to join El Sistema’s global program. Now in 2015, the foundation has three ensembles: the Music for Peace Orchestra, which consists of two orchestras practicing and performing in different levels; the Music for Peace Chorus, which has joined Istanbul International Polyphonic Choruses Festival; and the recently established Music for Peace Brass Ensemble.Keywords: El Sistema, music education, music for peace, socialization
Procedia PDF Downloads 421338 Extracting Actions with Improved Part of Speech Tagging for Social Networking Texts
Authors: Yassine Jamoussi, Ameni Youssfi, Henda Ben Ghezala
Abstract:
With the growing interest in social networking, the interaction of social actors evolved to a source of knowledge in which it becomes possible to perform context aware-reasoning. The information extraction from social networking especially Twitter and Facebook is one of the problems in this area. To extract text from social networking, we need several lexical features and large scale word clustering. We attempt to expand existing tokenizer and to develop our own tagger in order to support the incorrect words currently in existence in Facebook and Twitter. Our goal in this work is to benefit from the lexical features developed for Twitter and online conversational text in previous works, and to develop an extraction model for constructing a huge knowledge based on actionsKeywords: social networking, information extraction, part-of-speech tagging, natural language processing
Procedia PDF Downloads 305337 Authentication Based on Hand Movement by Low Dimensional Space Representation
Authors: Reut Lanyado, David Mendlovic
Abstract:
Most biological methods for authentication require special equipment and, some of them are easy to fake. We proposed a method for authentication based on hand movement while typing a sentence with a regular camera. This technique uses the full video of the hand, which is harder to fake. In the first phase, we tracked the hand joints in each frame. Next, we represented a single frame for each individual using our Pose Agnostic Rotation and Movement (PARM) dimensional space. Then, we indicated a full video of hand movement in a fixed low dimensional space using this method: Fixed Dimension Video by Interpolation Statistics (FDVIS). Finally, we identified each individual in the FDVIS representation using unsupervised clustering and supervised methods. Accuracy exceeds 96% for 80 individuals by using supervised KNN.Keywords: authentication, feature extraction, hand recognition, security, signal processing
Procedia PDF Downloads 127336 Uncertainty in Near-Term Global Surface Warming Linked to Pacific Trade Wind Variability
Authors: M. Hadi Bordbar, Matthew England, Alex Sen Gupta, Agus Santoso, Andrea Taschetto, Thomas Martin, Wonsun Park, Mojib Latif
Abstract:
Climate models generally simulate long-term reductions in the Pacific Walker Circulation with increasing atmospheric greenhouse gases. However, over two recent decades (1992-2011) there was a strong intensification of the Pacific Trade Winds that is linked with a slowdown in global surface warming. Using large ensembles of multiple climate models forced by increasing atmospheric greenhouse gas concentrations and starting from different ocean and/or atmospheric initial conditions, we reveal very diverse 20-year trends in the tropical Pacific climate associated with a considerable uncertainty in the globally averaged surface air temperature (SAT) in each model ensemble. This result suggests low confidence in our ability to accurately predict SAT trends over 20-year timescale only from external forcing. We show, however, that the uncertainty can be reduced when the initial oceanic state is adequately known and well represented in the model. Our analyses suggest that internal variability in the Pacific trade winds can mask the anthropogenic signal over a 20-year time frame, and drive transitions between periods of accelerated global warming and temporary slowdown periods.Keywords: trade winds, walker circulation, hiatus in the global surface warming, internal climate variability
Procedia PDF Downloads 268335 Capacitated Multiple Allocation P-Hub Median Problem on a Cluster Based Network under Congestion
Authors: Çağrı Özgün Kibiroğlu, Zeynep Turgut
Abstract:
This paper considers a hub location problem where the network service area partitioned into predetermined zones (represented by node clusters is given) and potential hub nodes capacity levels are determined a priori as a selection criteria of hub to investigate congestion effect on network. The objective is to design hub network by determining all required hub locations in the node clusters and also allocate non-hub nodes to hubs such that the total cost including transportation cost, opening cost of hubs and penalty cost for exceed of capacity level at hubs is minimized. A mixed integer linear programming model is developed introducing additional constraints to the traditional model of capacitated multiple allocation hub location problem and empirically tested.Keywords: hub location problem, p-hub median problem, clustering, congestion
Procedia PDF Downloads 492334 Machine Learning and Deep Learning Approach for People Recognition and Tracking in Crowd for Safety Monitoring
Authors: A. Degale Desta, Cheng Jian
Abstract:
Deep learning application in computer vision is rapidly advancing, giving it the ability to monitor the public and quickly identify potentially anomalous behaviour from crowd scenes. Therefore, the purpose of the current work is to improve the performance of safety of people in crowd events from panic behaviour through introducing the innovative idea of Aggregation of Ensembles (AOE), which makes use of the pre-trained ConvNets and a pool of classifiers to find anomalies in video data with packed scenes. According to the theory of algorithms that applied K-means, KNN, CNN, SVD, and Faster-CNN, YOLOv5 architectures learn different levels of semantic representation from crowd videos; the proposed approach leverages an ensemble of various fine-tuned convolutional neural networks (CNN), allowing for the extraction of enriched feature sets. In addition to the above algorithms, a long short-term memory neural network to forecast future feature values and a handmade feature that takes into consideration the peculiarities of the crowd to understand human behavior. On well-known datasets of panic situations, experiments are run to assess the effectiveness and precision of the suggested method. Results reveal that, compared to state-of-the-art methodologies, the system produces better and more promising results in terms of accuracy and processing speed.Keywords: action recognition, computer vision, crowd detecting and tracking, deep learning
Procedia PDF Downloads 161333 Diagnose of the Future of Family Businesses Based on the Study of Spanish Family Businesses Founders
Authors: Fernando Doral
Abstract:
Family businesses are a key phenomenon within the business landscape. Nevertheless, it involves two terms (“family” and “business”) which are nowadays rapidly evolving. Consequently, it isn't easy to diagnose if a family business will be a growing or decreasing phenomenon, which is the objective of this study. For that purpose, a sample of 50 Spanish-established companies from various sectors was taken. Different factors were identified for each enterprise, related to the profile of the founders, such as age, the number of sons and daughters, or support received from the family at the moment to start it up. That information was taken as an input for a clustering method to identify groups, which could help define the founders' profiles. That characterization was carried as a base to identify three factors whose evolution should be analyzed: family structures, business landscape and entrepreneurs' motivations. The analysis of the evolution of these three factors seems to indicate a negative tendency of family businesses. Therefore the consequent diagnosis of this study is to consider family businesses as a declining phenomenon.Keywords: business diagnose, business trends, family business, family business founders
Procedia PDF Downloads 207332 Data Mining Techniques for Anti-Money Laundering
Authors: M. Sai Veerendra
Abstract:
Today, money laundering (ML) poses a serious threat not only to financial institutions but also to the nation. This criminal activity is becoming more and more sophisticated and seems to have moved from the cliché of drug trafficking to financing terrorism and surely not forgetting personal gain. Most of the financial institutions internationally have been implementing anti-money laundering solutions (AML) to fight investment fraud activities. However, traditional investigative techniques consume numerous man-hours. Recently, data mining approaches have been developed and are considered as well-suited techniques for detecting ML activities. Within the scope of a collaboration project on developing a new data mining solution for AML Units in an international investment bank in Ireland, we survey recent data mining approaches for AML. In this paper, we present not only these approaches but also give an overview on the important factors in building data mining solutions for AML activities.Keywords: data mining, clustering, money laundering, anti-money laundering solutions
Procedia PDF Downloads 537331 Simulation Approach for a Comparison of Linked Cluster Algorithm and Clusterhead Size Algorithm in Ad Hoc Networks
Authors: Ameen Jameel Alawneh
Abstract:
A Mobile ad-hoc network (MANET) is a collection of wireless mobile hosts that dynamically form a temporary network without the aid of a system administrator. It has neither fixed infrastructure nor wireless ad hoc sessions. It inherently reaches several nodes with a single transmission, and each node functions as both a host and a router. The network maybe represented as a set of clusters each managed by clusterhead. The cluster size is not fixed and it depends on the movement of nodes. We proposed a clusterhead size algorithm (CHSize). This clustering algorithm can be used by several routing algorithms for ad hoc networks. An elected clusterhead is assigned for communication with all other clusters. Analysis and simulation of the algorithm has been implemented using GloMoSim networks simulator, MATLAB and MAPL11 proved that the proposed algorithm achieves the goals.Keywords: simulation, MANET, Ad-hoc, cluster head size, linked cluster algorithm, loss and dropped packets
Procedia PDF Downloads 391330 The Data Quality Model for the IoT based Real-time Water Quality Monitoring Sensors
Authors: Rabbia Idrees, Ananda Maiti, Saurabh Garg, Muhammad Bilal Amin
Abstract:
IoT devices are the basic building blocks of IoT network that generate enormous volume of real-time and high-speed data to help organizations and companies to take intelligent decisions. To integrate this enormous data from multisource and transfer it to the appropriate client is the fundamental of IoT development. The handling of this huge quantity of devices along with the huge volume of data is very challenging. The IoT devices are battery-powered and resource-constrained and to provide energy efficient communication, these IoT devices go sleep or online/wakeup periodically and a-periodically depending on the traffic loads to reduce energy consumption. Sometime these devices get disconnected due to device battery depletion. If the node is not available in the network, then the IoT network provides incomplete, missing, and inaccurate data. Moreover, many IoT applications, like vehicle tracking and patient tracking require the IoT devices to be mobile. Due to this mobility, If the distance of the device from the sink node become greater than required, the connection is lost. Due to this disconnection other devices join the network for replacing the broken-down and left devices. This make IoT devices dynamic in nature which brings uncertainty and unreliability in the IoT network and hence produce bad quality of data. Due to this dynamic nature of IoT devices we do not know the actual reason of abnormal data. If data are of poor-quality decisions are likely to be unsound. It is highly important to process data and estimate data quality before bringing it to use in IoT applications. In the past many researchers tried to estimate data quality and provided several Machine Learning (ML), stochastic and statistical methods to perform analysis on stored data in the data processing layer, without focusing the challenges and issues arises from the dynamic nature of IoT devices and how it is impacting data quality. A comprehensive review on determining the impact of dynamic nature of IoT devices on data quality is done in this research and presented a data quality model that can deal with this challenge and produce good quality of data. This research presents the data quality model for the sensors monitoring water quality. DBSCAN clustering and weather sensors are used in this research to make data quality model for the sensors monitoring water quality. An extensive study has been done in this research on finding the relationship between the data of weather sensors and sensors monitoring water quality of the lakes and beaches. The detailed theoretical analysis has been presented in this research mentioning correlation between independent data streams of the two sets of sensors. With the help of the analysis and DBSCAN, a data quality model is prepared. This model encompasses five dimensions of data quality: outliers’ detection and removal, completeness, patterns of missing values and checks the accuracy of the data with the help of cluster’s position. At the end, the statistical analysis has been done on the clusters formed as the result of DBSCAN, and consistency is evaluated through Coefficient of Variation (CoV).Keywords: clustering, data quality, DBSCAN, and Internet of things (IoT)
Procedia PDF Downloads 139329 Performance Prediction Methodology of Slow Aging Assets
Authors: M. Ben Slimene, M.-S. Ouali
Abstract:
Asset management of urban infrastructures faces a multitude of challenges that need to be overcome to obtain a reliable measurement of performances. Predicting the performance of slowly aging systems is one of those challenges, which helps the asset manager to investigate specific failure modes and to undertake the appropriate maintenance and rehabilitation interventions to avoid catastrophic failures as well as to optimize the maintenance costs. This article presents a methodology for modeling the deterioration of slowly degrading assets based on an operating history. It consists of extracting degradation profiles by grouping together assets that exhibit similar degradation sequences using an unsupervised classification technique derived from artificial intelligence. The obtained clusters are used to build the performance prediction models. This methodology is applied to a sample of a stormwater drainage culvert dataset.Keywords: artificial Intelligence, clustering, culvert, regression model, slow degradation
Procedia PDF Downloads 111328 Unsupervised Learning with Self-Organizing Maps for Named Entity Recognition in the CONLL2003 Dataset
Authors: Assel Jaxylykova, Alexnder Pak
Abstract:
This study utilized a Self-Organizing Map (SOM) for unsupervised learning on the CONLL-2003 dataset for Named Entity Recognition (NER). The process involved encoding words into 300-dimensional vectors using FastText. These vectors were input into a SOM grid, where training adjusted node weights to minimize distances. The SOM provided a topological representation for identifying and clustering named entities, demonstrating its efficacy without labeled examples. Results showed an F1-measure of 0.86, highlighting SOM's viability. Although some methods achieve higher F1 measures, SOM eliminates the need for labeled data, offering a scalable and efficient alternative. The SOM's ability to uncover hidden patterns provides insights that could enhance existing supervised methods. Further investigation into potential limitations and optimization strategies is suggested to maximize benefits.Keywords: named entity recognition, natural language processing, self-organizing map, CONLL-2003, semantics
Procedia PDF Downloads 45327 Exploring the Nature and Meaning of Theory in the Field of Neuroeducation Studies
Authors: Ali Nouri
Abstract:
Neuroeducation is one of the most exciting research fields which is continually evolving. However, there is a need to develop its theoretical bases in connection to practice. The present paper is a starting attempt in this regard to provide a space from which to think about neuroeducational theory and invoke more investigation in this area. Accordingly, a comprehensive theory of neuroeducation could be defined as grouping or clustering of concepts and propositions that describe and explain the nature of human learning to provide valid interpretations and implications useful for educational practice in relation to philosophical aspects or values. Whereas it should be originated from the philosophical foundations of the field and explain its normative significance, it needs to be testable in terms of rigorous evidence to fundamentally advance contemporary educational policy and practice. There is thus pragmatically a need to include a course on neuroeducational theory into the curriculum of the field. In addition, there is a need to articulate and disseminate considerable discussion over the subject within professional journals and academic societies.Keywords: neuroeducation studies, neuroeducational theory, theory building, neuroeducation research
Procedia PDF Downloads 447326 MIMIC: A Multi Input Micro-Influencers Classifier
Authors: Simone Leonardi, Luca Ardito
Abstract:
Micro-influencers are effective elements in the marketing strategies of companies and institutions because of their capability to create an hyper-engaged audience around a specific topic of interest. In recent years, many scientific approaches and commercial tools have handled the task of detecting this type of social media users. These strategies adopt solutions ranging from rule based machine learning models to deep neural networks and graph analysis on text, images, and account information. This work compares the existing solutions and proposes an ensemble method to generalize them with different input data and social media platforms. The deployed solution combines deep learning models on unstructured data with statistical machine learning models on structured data. We retrieve both social media accounts information and multimedia posts on Twitter and Instagram. These data are mapped into feature vectors for an eXtreme Gradient Boosting (XGBoost) classifier. Sixty different topics have been analyzed to build a rule based gold standard dataset and to compare the performances of our approach against baseline classifiers. We prove the effectiveness of our work by comparing the accuracy, precision, recall, and f1 score of our model with different configurations and architectures. We obtained an accuracy of 0.91 with our best performing model.Keywords: deep learning, gradient boosting, image processing, micro-influencers, NLP, social media
Procedia PDF Downloads 183325 A Literature Review on the Role of Local Potential for Creative Industries
Authors: Maya Irjayanti
Abstract:
Local creativity utilization has been a strategic investment to be expanded as a creative industry due to its significant contribution to the national gross domestic product. Many developed and developing countries look toward creative industries as an agenda for the economic growth. This study aims to identify the role of local potential for creative industries from various empirical studies. The method performed in this study will involve a peer-reviewed journal articles and conference papers review addressing local potential and creative industries. The literature review analysis will include several steps: material collection, descriptive analysis, category selection, and material evaluation. Finally, the outcome expected provides a creative industries clustering based on the local potential of various nations. In addition, the finding of this study will be used as future research reference to explore a particular area with well-known aspects of local potential for creative industry products.Keywords: business, creativity, local potential, local wisdom
Procedia PDF Downloads 385324 Fast Short-Term Electrical Load Forecasting under High Meteorological Variability with a Multiple Equation Time Series Approach
Authors: Charline David, Alexandre Blondin Massé, Arnaud Zinflou
Abstract:
In 2016, Clements, Hurn, and Li proposed a multiple equation time series approach for the short-term load forecasting, reporting an average mean absolute percentage error (MAPE) of 1.36% on an 11-years dataset for the Queensland region in Australia. We present an adaptation of their model to the electrical power load consumption for the whole Quebec province in Canada. More precisely, we take into account two additional meteorological variables — cloudiness and wind speed — on top of temperature, as well as the use of multiple meteorological measurements taken at different locations on the territory. We also consider other minor improvements. Our final model shows an average MAPE score of 1:79% over an 8-years dataset.Keywords: short-term load forecasting, special days, time series, multiple equations, parallelization, clustering
Procedia PDF Downloads 103323 Study of Aqueous Solutions: A Dielectric Spectroscopy Approach
Authors: Kumbharkhane Ashok
Abstract:
The time domain dielectric relaxation spectroscopy (TDRS) probes the interaction of a macroscopic sample with a time-dependent electrical field. The resulting complex permittivity spectrum, characterizes amplitude (voltage) and time scale of the charge-density fluctuations within the sample. These fluctuations may arise from the reorientation of the permanent dipole moments of individual molecules or from the rotation of dipolar moieties in flexible molecules, like polymers. The time scale of these fluctuations depends on the sample and its relative relaxation mechanism. Relaxation times range from some picoseconds in low viscosity liquids to hours in glasses, Therefore the DRS technique covers an extensive dynamical process, its corresponding frequency range from 10-4 Hz to 1012 Hz. This inherent ability to monitor the cooperative motion of molecular ensemble distinguishes dielectric relaxation from methods like NMR or Raman spectroscopy which yield information on the motions of individual molecules. An experimental set up for Time Domain Reflectometry (TDR) technique from 10 MHz to 30 GHz has been developed for the aqueous solutions. This technique has been very simple and covers a wide band of frequencies in the single measurement. Dielectric Relaxation Spectroscopy is especially sensitive to intermolecular interactions. The complex permittivity spectra of aqueous solutions have been fitted using Cole-Davidson (CD) model to determine static dielectric constants and relaxation times for entire concentrations. The heterogeneous molecular interactions in aqueous solutions have been discussed through Kirkwood correlation factor and excess properties.Keywords: liquid, aqueous solutions, time domain reflectometry
Procedia PDF Downloads 444322 Liver Lesion Extraction with Fuzzy Thresholding in Contrast Enhanced Ultrasound Images
Authors: Abder-Rahman Ali, Adélaïde Albouy-Kissi, Manuel Grand-Brochier, Viviane Ladan-Marcus, Christine Hoeffl, Claude Marcus, Antoine Vacavant, Jean-Yves Boire
Abstract:
In this paper, we present a new segmentation approach for focal liver lesions in contrast enhanced ultrasound imaging. This approach, based on a two-cluster Fuzzy C-Means methodology, considers type-II fuzzy sets to handle uncertainty due to the image modality (presence of speckle noise, low contrast, etc.), and to calculate the optimum inter-cluster threshold. Fine boundaries are detected by a local recursive merging of ambiguous pixels. The method has been tested on a representative database. Compared to both Otsu and type-I Fuzzy C-Means techniques, the proposed method significantly reduces the segmentation errors.Keywords: defuzzification, fuzzy clustering, image segmentation, type-II fuzzy sets
Procedia PDF Downloads 485321 Analytical Study of Data Mining Techniques for Software Quality Assurance
Authors: Mariam Bibi, Rubab Mehboob, Mehreen Sirshar
Abstract:
Satisfying the customer requirements is the ultimate goal of producing or developing any product. The quality of the product is decided on the bases of the level of customer satisfaction. There are different techniques which have been reported during the survey which enhance the quality of the product through software defect prediction and by locating the missing software requirements. Some mining techniques were proposed to assess the individual performance indicators in collaborative environment to reduce errors at individual level. The basic intention is to produce a product with zero or few defects thereby producing a best product quality wise. In the analysis of survey the techniques like Genetic algorithm, artificial neural network, classification and clustering techniques and decision tree are studied. After analysis it has been discovered that these techniques contributed much to the improvement and enhancement of the quality of the product.Keywords: data mining, defect prediction, missing requirements, software quality
Procedia PDF Downloads 467320 Business Intelligence for Profiling of Telecommunication Customer
Authors: Rokhmatul Insani, Hira Laksmiwati Soemitro
Abstract:
Business Intelligence is a methodology that exploits the data to produce information and knowledge systematically, business intelligence can support the decision-making process. Some methods in business intelligence are data warehouse and data mining. A data warehouse can store historical data from transactional data. For data modelling in data warehouse, we apply dimensional modelling by Kimball. While data mining is used to extracting patterns from the data and get insight from the data. Data mining has many techniques, one of which is segmentation. For profiling of telecommunication customer, we use customer segmentation according to customer’s usage of services, customer invoice and customer payment. Customers can be grouped according to their characteristics and can be identified the profitable customers. We apply K-Means Clustering Algorithm for segmentation. The input variable for that algorithm we use RFM (Recency, Frequency and Monetary) model. All process in data mining, we use tools IBM SPSS modeller.Keywords: business intelligence, customer segmentation, data warehouse, data mining
Procedia PDF Downloads 483319 Learning Grammars for Detection of Disaster-Related Micro Events
Authors: Josef Steinberger, Vanni Zavarella, Hristo Tanev
Abstract:
Natural disasters cause tens of thousands of victims and massive material damages. We refer to all those events caused by natural disasters, such as damage on people, infrastructure, vehicles, services and resource supply, as micro events. This paper addresses the problem of micro - event detection in online media sources. We present a natural language grammar learning algorithm and apply it to online news. The algorithm in question is based on distributional clustering and detection of word collocations. We also explore the extraction of micro-events from social media and describe a Twitter mining robot, who uses combinations of keywords to detect tweets which talk about effects of disasters.Keywords: online news, natural language processing, machine learning, event extraction, crisis computing, disaster effects, Twitter
Procedia PDF Downloads 478318 An Embarrassingly Simple Semi-supervised Approach to Increase Recall in Online Shopping Domain to Match Structured Data with Unstructured Data
Authors: Sachin Nagargoje
Abstract:
Complete labeled data is often difficult to obtain in a practical scenario. Even if one manages to obtain the data, the quality of the data is always in question. In shopping vertical, offers are the input data, which is given by advertiser with or without a good quality of information. In this paper, an author investigated the possibility of using a very simple Semi-supervised learning approach to increase the recall of unhealthy offers (has badly written Offer Title or partial product details) in shopping vertical domain. The author found that the semisupervised learning method had improved the recall in the Smart Phone category by 30% on A=B testing on 10% traffic and increased the YoY (Year over Year) number of impressions per month by 33% at production. This also made a significant increase in Revenue, but that cannot be publicly disclosed.Keywords: semi-supervised learning, clustering, recall, coverage
Procedia PDF Downloads 122317 Computing Customer Lifetime Value in E-Commerce Websites with Regard to Returned Orders and Payment Method
Authors: Morteza Giti
Abstract:
As online shopping is becoming increasingly popular, computing customer lifetime value for better knowing the customers is also gaining more importance. Two distinct factors that can affect the value of a customer in the context of online shopping is the number of returned orders and payment method. Returned orders are those which have been shipped but not collected by the customer and are returned to the store. Payment method refers to the way that customers choose to pay for the price of the order which are usually two: Pre-pay and Cash-on-delivery. In this paper, a novel model called RFMSP is presented to calculated the customer lifetime value, taking these two parameters into account. The RFMSP model is based on the common RFM model while adding two extra parameter. The S represents the order status and the P indicates the payment method. As a case study for this model, the purchase history of customers in an online shop is used to compute the customer lifetime value over a period of twenty months.Keywords: RFMSP model, AHP, customer lifetime value, k-means clustering, e-commerce
Procedia PDF Downloads 320316 Live Concert Performances in Preschool: Requirements of a Successful Concert for Young Children
Authors: Mei-Ying Liao
Abstract:
The main purpose of this study was to examine the requirements of a successful concert for young children in preschool in Taiwan. This study reports a case study of a preschool’s experience which undertook ten concerts for young children. The main audiences were young children who were two to six years of age. The performers, including children’s family, amateurs and professional performers, were invited to perform music instruments or singing twice a week. The performers participated in these concerts separately, as a solo or ensemble performance. There were totally ten concerts. The structure of concert included the performance, musical activities, questions and answers, song requests, and exploration of instruments. Data collection included interviews with children, teachers and performers, concert observations, and footnotes. Results showed that the requirements of a successful and meaningful concert for young children were suggested to include concert preparation, concert, and post activities. The concert organizer, host and classroom teachers played vital roles for a successful concert. The organizer had to organize the programs and prepared for the concerts based on the needs and interests of their audience of young children, engage their attention and offer the potential to expand their musical worlds. The hosts had to build a bridge between performers and young children who had to know how they could delight and educate children. Concerts combined games, storytelling, instrument exploration and great music had great effects. Finally, the classroom teachers had to do the extension activities after the concerts so that the children will involve more and get more enthusiasm in concerts.Keywords: case study, concert, music education, performance
Procedia PDF Downloads 346315 PDDA: Priority-Based, Dynamic Data Aggregation Approach for Sensor-Based Big Data Framework
Authors: Lutful Karim, Mohammed S. Al-kahtani
Abstract:
Sensors are being used in various applications such as agriculture, health monitoring, air and water pollution monitoring, traffic monitoring and control and hence, play the vital role in the growth of big data. However, sensors collect redundant data. Thus, aggregating and filtering sensors data are significantly important to design an efficient big data framework. Current researches do not focus on aggregating and filtering data at multiple layers of sensor-based big data framework. Thus, this paper introduces (i) three layers data aggregation and framework for big data and (ii) a priority-based, dynamic data aggregation scheme (PDDA) for the lowest layer at sensors. Simulation results show that the PDDA outperforms existing tree and cluster-based data aggregation scheme in terms of overall network energy consumptions and end-to-end data transmission delay.Keywords: big data, clustering, tree topology, data aggregation, sensor networks
Procedia PDF Downloads 346314 TMBCoI-SIOT: Trust Management System Based on the Community of Interest for the Social Internet of Things
Authors: Oumaima Ben Abderrahim, Mohamed Houcine Elhedhili, Leila Saidane
Abstract:
In this paper, we propose a trust management system based on clustering architecture for the social internet of things called TMBCO-SIOT. The proposed model integrates numerous factors such as direct and indirect trust; transaction factor; precaution factor; and social modeling of trust. The novelty of our approach can be summed up in two aspects. The first aspect concerns the architecture based on the community of interest (CoT) where each community is headed by an administrator (admin). However, the second aspect is the trust management system that tries to prevent On-Off attacks and mitigates dishonest recommendations using the k-means algorithm and guarantor things. The effectiveness of the proposed system is proved by simulation against malicious nodes.Keywords: IoT, trust management system, attacks, trust, dishonest recommendations, K-means algorithm
Procedia PDF Downloads 212313 E-Hailing Taxi Industry Management Mode Innovation Based on the Credit Evaluation
Authors: Yuan-lin Liu, Ye Li, Tian Xia
Abstract:
There are some shortcomings in Chinese existing taxi management modes. This paper suggests to establish the third-party comprehensive information management platform and put forward an evaluation model based on credit. Four indicators are used to evaluate the drivers’ credit, they are passengers’ evaluation score, driving behavior evaluation, drivers’ average bad record number, and personal credit score. A weighted clustering method is used to achieve credit level evaluation for taxi drivers. The management of taxi industry is based on the credit level, while the grade of the drivers is accorded to their credit rating. Credit rating determines the cost, income levels, the market access, useful period of license and the level of wage and bonus, as well as violation fine. These methods can make the credit evaluation effective. In conclusion, more credit data will help to set up a more accurate and detailed classification standard library.Keywords: credit, mobile internet, e-hailing taxi, management mode, weighted cluster
Procedia PDF Downloads 325312 Evaluation of Groundwater Quality and Its Suitability for Drinking and Agricultural Purposes Using Self-Organizing Maps
Authors: L. Belkhiri, L. Mouni, A. Tiri, T.S. Narany
Abstract:
In the present study, the self-organizing map (SOM) clustering technique was applied to identify homogeneous clusters of hydrochemical parameters in El Milia plain, Algeria, to assess the quality of groundwater for potable and agricultural purposes. The visualization of SOM-analysis indicated that 35 groundwater samples collected in the study area were classified into three clusters, which showed progressive increase in electrical conductivity from cluster one to cluster three. Samples belonging to cluster one are mostly located in the recharge zone showing hard fresh water type, however, water type gradually changed to hard-brackish type in the discharge zone, including clusters two and three. Ionic ratio studies indicated the role of carbonate rock dissolution in increases on groundwater hardness, especially in cluster one. However, evaporation and evapotranspiration are the main processes increasing salinity in cluster two and three.Keywords: groundwater quality, self-organizing maps, drinking water, irrigation water
Procedia PDF Downloads 256