Search results for: video mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 967

Search results for: video mining

97 Effects of Livestream Affordances on Consumer Purchase Willingness: Explicit IT Affordances Perspective

Authors: Isaac O. Asante, Yushi Jiang, Hailin Tao

Abstract:

Livestreaming marketing, the new electronic commerce element, has become an optional marketing channel following the COVID-19 pandemic, and many sellers are leveraging the features presented by livestreaming to increase sales. This study was conducted to measure real-time observable interactions between consumers and sellers. Based on the affordance theory, this study conceptualized constructs representing the interactive features and examined how they drive consumers’ purchase willingness during livestreaming sessions using 1238 datasets from Amazon Live, following the manual observation of transaction records. Using structural equation modeling, the ordinary least square regression suggests that live viewers, new followers, live chats, and likes positively affect purchase willingness. The Sobel and Monte Carlo tests show that new followers, live chats, and likes significantly mediate the relationship between live viewers and purchase willingness. The study presents a way of measuring interactions in livestreaming commerce and proposes a way to manually gather data on consumer behaviors in livestreaming platforms when the application programming interface (API) of such platforms does not support data mining algorithms.

Keywords: Livestreaming marketing, live chats, live viewers, likes, new followers, purchase willingness.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 126
96 Development of an Ensemble Classification Model Based on Hybrid Filter-Wrapper Feature Selection for Email Phishing Detection

Authors: R. B. Ibrahim, M. S. Argungu, I. M. Mungadi

Abstract:

It is obvious in this present time, internet has become an indispensable part of human life since its inception. The Internet has provided diverse opportunities to make life so easy for human beings, through the adoption of various channels. Among these channels are email, internet banking, video conferencing, and the like. Email is one of the easiest means of communication hugely accepted among individuals and organizations globally. But over decades the security integrity of this platform has been challenged with malicious activities like Phishing. Email phishing is designed by phishers to fool the recipient into handing over sensitive personal information such as passwords, credit card numbers, account credentials, social security numbers, etc. This activity has caused a lot of financial damage to email users globally which has resulted in bankruptcy, sudden death of victims, and other health-related sicknesses. Although many methods have been proposed to detect email phishing, in this research, the results of multiple machine-learning methods for predicting email phishing have been compared with the use of filter-wrapper feature selection. It is worth noting that all three models performed substantially but one outperformed the other. The dataset used for these models is obtained from Kaggle online data repository, while three classifiers: decision tree, Naïve Bayes, and Logistic regression are ensemble (Bagging) respectively. Results from the study show that the Decision Tree (CART) bagging ensemble recorded the highest accuracy of 98.13% using PEF (Phishing Essential Features). This result further demonstrates the dependability of the proposed model.

Keywords: Ensemble, hybrid, filter-wrapper, phishing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 165
95 Stereo Motion Tracking

Authors: Yudhajit Datta, Jonathan Bandi, Ankit Sethia, Hamsi Iyer

Abstract:

Motion Tracking and Stereo Vision are complicated, albeit well-understood problems in computer vision. Existing softwares that combine the two approaches to perform stereo motion tracking typically employ complicated and computationally expensive procedures. The purpose of this study is to create a simple and effective solution capable of combining the two approaches. The study aims to explore a strategy to combine the two techniques of two-dimensional motion tracking using Kalman Filter; and depth detection of object using Stereo Vision. In conventional approaches objects in the scene of interest are observed using a single camera. However for Stereo Motion Tracking; the scene of interest is observed using video feeds from two calibrated cameras. Using two simultaneous measurements from the two cameras a calculation for the depth of the object from the plane containing the cameras is made. The approach attempts to capture the entire three-dimensional spatial information of each object at the scene and represent it through a software estimator object. In discrete intervals, the estimator tracks object motion in the plane parallel to plane containing cameras and updates the perpendicular distance value of the object from the plane containing the cameras as depth. The ability to efficiently track the motion of objects in three-dimensional space using a simplified approach could prove to be an indispensable tool in a variety of surveillance scenarios. The approach may find application from high security surveillance scenes such as premises of bank vaults, prisons or other detention facilities; to low cost applications in supermarkets and car parking lots.

Keywords: Kalman Filter, Stereo Vision, Motion Tracking, Matlab, Object Tracking, Camera Calibration, Computer Vision System Toolbox.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2820
94 Analysis of Driver Point of Regard Determinations with Eye-Gesture Templates Using Receiver Operating Characteristic

Authors: Siti Nor Hafizah binti Mohd Zaid, Mohamed Abdel-Maguid, Abdel-Hamid Soliman

Abstract:

An Advance Driver Assistance System (ADAS) is a computer system on board a vehicle which is used to reduce the risk of vehicular accidents by monitoring factors relating to the driver, vehicle and environment and taking some action when a risk is identified. Much work has been done on assessing vehicle and environmental state but there is still comparatively little published work that tackles the problem of driver state. Visual attention is one such driver state. In fact, some researchers claim that lack of attention is the main cause of accidents as factors such as fatigue, alcohol or drug use, distraction and speeding all impair the driver-s capacity to pay attention to the vehicle and road conditions [1]. This seems to imply that the main cause of accidents is inappropriate driver behaviour in cases where the driver is not giving full attention while driving. The work presented in this paper proposes an ADAS system which uses an image based template matching algorithm to detect if a driver is failing to observe particular windscreen cells. This is achieved by dividing the windscreen into 24 uniform cells (4 rows of 6 columns) and matching video images of the driver-s left eye with eye-gesture templates drawn from images of the driver looking at the centre of each windscreen cell. The main contribution of this paper is to assess the accuracy of this approach using Receiver Operating Characteristic analysis. The results of our evaluation give a sensitivity value of 84.3% and a specificity value of 85.0% for the eye-gesture template approach indicating that it may be useful for driver point of regard determinations.

Keywords: Advanced Driver Assistance Systems, Eye-Tracking, Hazard Detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1627
93 Development of Prediction Models of Day-Ahead Hourly Building Electricity Consumption and Peak Power Demand Using the Machine Learning Method

Authors: Dalin Si, Azizan Aziz, Bertrand Lasternas

Abstract:

To encourage building owners to purchase electricity at the wholesale market and reduce building peak demand, this study aims to develop models that predict day-ahead hourly electricity consumption and demand using artificial neural network (ANN) and support vector machine (SVM). All prediction models are built in Python, with tool Scikit-learn and Pybrain. The input data for both consumption and demand prediction are time stamp, outdoor dry bulb temperature, relative humidity, air handling unit (AHU), supply air temperature and solar radiation. Solar radiation, which is unavailable a day-ahead, is predicted at first, and then this estimation is used as an input to predict consumption and demand. Models to predict consumption and demand are trained in both SVM and ANN, and depend on cooling or heating, weekdays or weekends. The results show that ANN is the better option for both consumption and demand prediction. It can achieve 15.50% to 20.03% coefficient of variance of root mean square error (CVRMSE) for consumption prediction and 22.89% to 32.42% CVRMSE for demand prediction, respectively. To conclude, the presented models have potential to help building owners to purchase electricity at the wholesale market, but they are not robust when used in demand response control.

Keywords: Building energy prediction, data mining, demand response, electricity market.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2199
92 ARMrayan Multimedia Mobile CMS: a Simplified Approach towards Content-Oriented Mobile Application Designing

Authors: Ali Reza Manashty, Mohammad Reza Ahmadzadeh Raji, Zahra Forootan Jahromi, Amir Rajabzadeh

Abstract:

The ARMrayan Multimedia Mobile CMS (Content Management System) is the first mobile CMS that gives the opportunity to users for creating multimedia J2ME mobile applications with their desired content, design and logo; simply, without any need for writing even a line of code. The low-level programming and compatibility problems of the J2ME, along with UI designing difficulties, makes it hard for most people –even programmers- to broadcast their content to the widespread mobile phones used by nearly all people. This system provides user-friendly, PC-based tools for creating a tree index of pages and inserting multiple multimedia contents (e.g. sound, video and picture) in each page for creating a J2ME mobile application. The output is a standalone Java mobile application that has a user interface, shows texts and pictures and plays music and videos regardless of the type of devices used as long as the devices support the J2ME platform. Bitmap fonts have also been used thus Middle Eastern languages can be easily supported on all mobile phone devices. We omitted programming concepts for users in order to simplify multimedia content-oriented mobile applictaion designing for use in educational, cultural or marketing centers. Ordinary operators can now create a variety of multimedia mobile applications such as tutorials, catalogues, books, and guides in minutes rather than months. Simplicity and power has been the goal of this CMS. In this paper, we present the software engineered-designed concepts of the ARMrayan MCMS along with the implementation challenges faces and solutions adapted.

Keywords: Mobile CMS, MCMS, Mobile Content Builder, J2ME Application, Multimedia Mobile Application, MultimediaCMS, Multimedia Mobile CMS, Content Management System.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1663
91 Perceptual and Ultrasound Articulatory Training Effects on English L2 Vowels Production by Italian Learners

Authors: I. Sonia d’Apolito, Bianca Sisinni, Mirko Grimaldi, Barbara Gili Fivela

Abstract:

The American English contrast /ɑ-ʌ/ (cop-cup) is difficult to be produced by Italian learners since they realize L2-/ɑ-ʌ/ as L1-/ɔ-a/ respectively, due to differences in phonetic-phonological systems and also in grapheme-to-phoneme conversion rules. In this paper, we try to answer the following research questions: Can a short training improve the production of English /ɑ-ʌ/ by Italian learners? Is a perceptual training better than an articulatory (ultrasound - US) training? Thus, we compare a perceptual training with an US articulatory one to observe: 1) the effects of short trainings on L2-/ɑ-ʌ/ productions; 2) if the US articulatory training improves the pronunciation better than the perceptual training. In this pilot study, 9 Salento-Italian monolingual adults participated: 3 subjects performed a 1-hour perceptual training (ES-P); 3 subjects performed a 1-hour US training (ES-US); and 3 control subjects did not receive any training (CS). Verbal instructions about the phonetic properties of L2-/ɑ-ʌ/ and L1-/ɔ-a/ and their differences (representation on F1-F2 plane) were provided during both trainings. After these instructions, the ES-P group performed an identification training based on the High Variability Phonetic Training procedure, while the ES-US group performed the articulatory training, by means of US video of tongue gestures in L2-/ɑ-ʌ/ production and dynamic view of their own tongue movements and position using a probe under their chin. The acoustic data were analyzed and the first three formants were calculated. Independent t-tests were run to compare: 1) /ɑ-ʌ/ in pre- vs. post-test respectively; /ɑ-ʌ/ in pre- and post-test vs. L1-/a-ɔ/ respectively. Results show that in the pre-test all speakers realize L2-/ɑ-ʌ/ as L1-/ɔ-a/ respectively. Contrary to CS and ES-P groups, the ES-US group in the post-test differentiates the L2 vowels from those produced in the pre-test as well as from the L1 vowels, although only one ES-US subject produces both L2 vowels accurately. The articulatory training seems more effective than the perceptual one since it favors the production of vowels in the correct direction of L2 vowels and differently from the similar L1 vowels.

Keywords: L2 vowel production, perceptual training, articulatory training, ultrasound.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1015
90 Virulent-GO: Prediction of Virulent Proteins in Bacterial Pathogens Utilizing Gene Ontology Terms

Authors: Chia-Ta Tsai, Wen-Lin Huang, Shinn-Jang Ho, Li-Sun Shu, Shinn-Ying Ho

Abstract:

Prediction of bacterial virulent protein sequences can give assistance to identification and characterization of novel virulence-associated factors and discover drug/vaccine targets against proteins indispensable to pathogenicity. Gene Ontology (GO) annotation which describes functions of genes and gene products as a controlled vocabulary of terms has been shown effectively for a variety of tasks such as gene expression study, GO annotation prediction, protein subcellular localization, etc. In this study, we propose a sequence-based method Virulent-GO by mining informative GO terms as features for predicting bacterial virulent proteins. Each protein in the datasets used by the existing method VirulentPred is annotated by using BLAST to obtain its homologies with known accession numbers for retrieving GO terms. After investigating various popular classifiers using the same five-fold cross-validation scheme, Virulent-GO using the single kind of GO term features with an accuracy of 82.5% is slightly better than VirulentPred with 81.8% using five kinds of sequence-based features. For the evaluation of independent test, Virulent-GO also yields better results (82.0%) than VirulentPred (80.7%). When evaluating single kind of feature with SVM, the GO term feature performs much well, compared with each of the five kinds of features.

Keywords: Bacterial virulence factors, GO terms, prediction, protein sequence.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2182
89 Development of a Technology Assessment Model by Patents and Customers' Review Data

Authors: Kisik Song, Sungjoo Lee

Abstract:

Recent years have seen an increasing number of patent disputes due to excessive competition in the global market and a reduced technology life-cycle; this has increased the risk of investment in technology development. While many global companies have started developing a methodology to identify promising technologies and assess for decisions, the existing methodology still has some limitations. Post hoc assessments of the new technology are not being performed, especially to determine whether the suggested technologies turned out to be promising. For example, in existing quantitative patent analysis, a patent’s citation information has served as an important metric for quality assessment, but this analysis cannot be applied to recently registered patents because such information accumulates over time. Therefore, we propose a new technology assessment model that can replace citation information and positively affect technological development based on post hoc analysis of the patents for promising technologies. Additionally, we collect customer reviews on a target technology to extract keywords that show the customers’ needs, and we determine how many keywords are covered in the new technology. Finally, we construct a portfolio (based on a technology assessment from patent information) and a customer-based marketability assessment (based on review data), and we use them to visualize the characteristics of the new technologies.

Keywords: Technology assessment, patents, citation information, opinion mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 985
88 Data Privacy and Safety with Large Language Models

Authors: Ashly Joseph, Jithu Paulose

Abstract:

Large language models (LLMs) have revolutionized natural language processing capabilities, enabling applications such as chatbots, dialogue agents, image, and video generators. Nevertheless, their trainings on extensive datasets comprising personal information poses notable privacy and safety hazards. This study examines methods for addressing these challenges, specifically focusing on approaches to enhance the security of LLM outputs, safeguard user privacy, and adhere to data protection rules. We explore several methods including post-processing detection algorithms, content filtering, reinforcement learning from human and AI inputs, and the difficulties in maintaining a balance between model safety and performance. The study also emphasizes the dangers of unintentional data leakage, privacy issues related to user prompts, and the possibility of data breaches. We highlight the significance of corporate data governance rules and optimal methods for engaging with chatbots. In addition, we analyze the development of data protection frameworks, evaluate the adherence of LLMs to General Data Protection Regulation (GDPR), and examine privacy legislation in academic and business policies. We demonstrate the difficulties and remedies involved in preserving data privacy and security in the age of sophisticated artificial intelligence by employing case studies and real-life instances. This article seeks to educate stakeholders on practical strategies for improving the security and privacy of LLMs, while also assuring their responsible and ethical implementation.

Keywords: Data privacy, large language models, artificial intelligence, machine learning, cybersecurity, general data protection regulation, data safety.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 71
87 How Children Synchronize with Their Teacher: Evidence from a Real-World Elementary School Classroom

Authors: Reiko Yamamoto

Abstract:

This paper reports on how synchrony occurs between children and their teacher, and what prevents or facilitates synchrony. The aim of the experiment conducted in this study was to precisely analyze their movements and synchrony and reveal the process of synchrony in a real-world classroom. Specifically, the experiment was conducted for around 20 minutes during an English as a foreign language (EFL) lesson. The participants were 11 fourth-grade school children and their classroom teacher in a public elementary school in Japan. Previous researchers assert that synchrony causes the state of flow in a class. For checking the level of flow, Short Flow State Scale (SFSS) was adopted. The experimental procedure had four steps: 1) The teacher read aloud the first half of an English storybook to the children. Both the teacher and the children were at their own desks. 2) The children were subjected to an SFSS check. 3) The teacher read aloud the remaining half of the storybook to the children. She made the children remove their desks before reading. 4) The children were again subjected to an SFSS check. The movements of all participants were recorded with a video camera. From the movement analysis, it was found that the children synchronized better with the teacher in Step 3 than in Step 1, and that the teacher’s movement became free and outstanding without a desk. This implies that the desk acted as a barrier between the children and the teacher. Removal of this barrier resulted in the children’s reactions becoming synchronized with those of the teacher. The SFSS results proved that the children experienced more flow without a barrier than with a barrier. Apparently, synchrony is what caused flow or social emotions in the classroom. The main conclusion is that synchrony leads to cognitive outcomes such as children’s academic performance in EFL learning.

Keywords: Movement synchrony, teacher–child relationships, English as a foreign language, EFL learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 683
86 A Methodology for Automatic Diversification of Document Categories

Authors: Dasom Kim, Chen Liu, Myungsu Lim, Soo-Hyeon Jeon, Byeoung Kug Jeon, Kee-Young Kwahk, Namgyu Kim

Abstract:

Recently, numerous documents including large volumes of unstructured data and text have been created because of the rapid increase in the use of social media and the Internet. Usually, these documents are categorized for the convenience of users. Because the accuracy of manual categorization is not guaranteed, and such categorization requires a large amount of time and incurs huge costs. Many studies on automatic categorization have been conducted to help mitigate the limitations of manual categorization. Unfortunately, most of these methods cannot be applied to categorize complex documents with multiple topics because they work on the assumption that individual documents can be categorized into single categories only. Therefore, to overcome this limitation, some studies have attempted to categorize each document into multiple categories. However, the learning process employed in these studies involves training using a multi-categorized document set. These methods therefore cannot be applied to the multi-categorization of most documents unless multi-categorized training sets using traditional multi-categorization algorithms are provided. To overcome this limitation, in this study, we review our novel methodology for extending the category of a single-categorized document to multiple categorizes, and then introduce a survey-based verification scenario for estimating the accuracy of our automatic categorization methodology.

Keywords: Big Data Analysis, Document Classification, Text Mining, Topic Analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1741
85 Revitalisation of Indigenous Food in Africa through Print and Electronic Media

Authors: Adebisi. Elizabeth, Banjo

Abstract:

Language and culture are interwoven that they cannot be separated, for the knowledge of a language cannot be complete without having the culture of the language. Indigenous food is a cultural aspect of any language that is expected to be acquired by all the speakers of the language. Indigenous food is known to be vital right from early years, which is also attributed to the healthy living of the ancient people. However it is discovered that the indigenous food is almost being replaced by fast food products such as Indomie noodles, Spaghetti and Macaroni to the extent that majority of the young folks prefer the eating of the fast foods and cannot prepare the indigenous foods which are good for growth and healthy living of people. Therefore, there is need to revitalize and re-educate people on the indigenous food which is an aspect of inter-cultural education of any language to prevent it from being forgotten or neglected.

African foods are many, but this study focused on Nigerian food using some Yoruba dishes as a case study. Examples of Yoruba dishes are pounded yam and melon with vegetable and dried fish soup, beans pudding (moin moin) and pap (eko), water yam pudding with fish and meat (ikokore) and many more. The ingredients needed for the preparation of these indigenous foods contain some basic food nutrients which will be analyzed and their nutritional importance to human bodies will also be discussed.

The process of re- awakening the education of indigenous food to the present and up-coming generation should be via print and electronic media in form of advertisements on posters, billboards, calendars and in rhymes on television programs, radio presentations, video tapes and CD–ROM apart from classroom teaching and learning. Indigenous food is a panacea to healthy living and longevity, a prevention of diseases and a means of accelerated healing of the body through natural foods.

Keywords: Indigenous food, print and electronic media, nutritional values, re-awakening education.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2186
84 Recommender Systems Using Ensemble Techniques

Authors: Yeonjeong Lee, Kyoung-jae Kim, Youngtae Kim

Abstract:

This study proposes a novel recommender system that uses data mining and multi-model ensemble techniques to enhance the recommendation performance through reflecting the precise user’s preference. The proposed model consists of two steps. In the first step, this study uses logistic regression, decision trees, and artificial neural networks to predict customers who have high likelihood to purchase products in each product group. Then, this study combines the results of each predictor using the multi-model ensemble techniques such as bagging and bumping. In the second step, this study uses the market basket analysis to extract association rules for co-purchased products. Finally, the system selects customers who have high likelihood to purchase products in each product group and recommends proper products from same or different product groups to them through above two steps. We test the usability of the proposed system by using prototype and real-world transaction and profile data. In addition, we survey about user satisfaction for the recommended product list from the proposed system and the randomly selected product lists. The results also show that the proposed system may be useful in real-world online shopping store.

Keywords: Product recommender system, Ensemble technique, Association rules, Decision tree, Artificial neural networks.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4215
83 Strategic Mine Planning: A SWOT Analysis Applied to KOV Open Pit Mine in the Democratic Republic of Congo

Authors: Patrick May Mukonki

Abstract:

KOV pit (Kamoto Oliveira Virgule) is located 10 km from Kolwezi town, one of the mineral rich town in the Lualaba province of the Democratic Republic of Congo. The KOV pit is currently operating under the Katanga Mining Limited (KML), a Glencore-Gecamines (a State Owned Company) join venture. Recently, the mine optimization process provided a life of mine of approximately 10 years withnice pushbacks using the Datamine NPV Scheduler software. In previous KOV pit studies, we recently outlined the impact of the accuracy of the geological information on a long-term mine plan for a big copper mine such as KOV pit. The approach taken, discussed three main scenarios and outlined some weaknesses on the geological information side, and now, in this paper that we are going to develop here, we are going to highlight, as an overview, those weaknesses, strengths and opportunities, in a global SWOT analysis. The approach we are taking here is essentially descriptive in terms of steps taken to optimize KOV pit and, at every step, we categorized the challenges we faced to have a better tradeoff between what we called strengths and what we called weaknesses. The same logic is applied in terms of the opportunities and threats. The SWOT analysis conducted in this paper demonstrates that, despite a general poor ore body definition, and very rude ground water conditions, there is room for improvement for such high grade ore body.

Keywords: Mine planning, mine optimization, mine scheduling, SWOT analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1578
82 SNC Based Network Layer Design for Underwater Wireless Communication Used in Coral Farms

Authors: T. T. Manikandan, Rajeev Sukumaran

Abstract:

For maintaining the biodiversity of many ecosystems the existence of coral reefs play a vital role. But due to many factors such as pollution and coral mining, coral reefs are dying day by day. One way to protect the coral reefs is to farm them in a carefully monitored underwater environment and restore it in place of dead corals. For successful farming of corals in coral farms, different parameters of the water in the farming area need to be monitored and maintained at optimal level. Sensing underwater parameters using wireless sensor nodes is an effective way for precise and continuous monitoring in a highly dynamic environment like oceans. Here the sensed information is of varying importance and it needs to be provided with desired Quality of Service(QoS) guarantees in delivering the information to offshore monitoring centers. The main interest of this research is Stochastic Network Calculus (SNC) based modeling of network layer design for underwater wireless sensor communication. The model proposed in this research enforces differentiation of service in underwater wireless sensor communication with the help of buffer sizing and link scheduling. The delay and backlog bounds for such differentiated services are analytically derived using stochastic network calculus.

Keywords: Underwater Coral Farms, SNC, differentiated service, delay bound, backlog bound.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 348
81 Malware Beaconing Detection by Mining Large-scale DNS Logs for Targeted Attack Identification

Authors: Andrii Shalaginov, Katrin Franke, Xiongwei Huang

Abstract:

One of the leading problems in Cyber Security today is the emergence of targeted attacks conducted by adversaries with access to sophisticated tools. These attacks usually steal senior level employee system privileges, in order to gain unauthorized access to confidential knowledge and valuable intellectual property. Malware used for initial compromise of the systems are sophisticated and may target zero-day vulnerabilities. In this work we utilize common behaviour of malware called ”beacon”, which implies that infected hosts communicate to Command and Control servers at regular intervals that have relatively small time variations. By analysing such beacon activity through passive network monitoring, it is possible to detect potential malware infections. So, we focus on time gaps as indicators of possible C2 activity in targeted enterprise networks. We represent DNS log files as a graph, whose vertices are destination domains and edges are timestamps. Then by using four periodicity detection algorithms for each pair of internal-external communications, we check timestamp sequences to identify the beacon activities. Finally, based on the graph structure, we infer the existence of other infected hosts and malicious domains enrolled in the attack activities.

Keywords: Malware detection, network security, targeted attack.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6093
80 Agile Methodology for Modeling and Design of Data Warehouses -AM4DW-

Authors: Nieto Bernal Wilson, Carmona Suarez Edgar

Abstract:

The organizations have structured and unstructured information in different formats, sources, and systems. Part of these come from ERP under OLTP processing that support the information system, however these organizations in OLAP processing level, presented some deficiencies, part of this problematic lies in that does not exist interesting into extract knowledge from their data sources, as also the absence of operational capabilities to tackle with these kind of projects.  Data Warehouse and its applications are considered as non-proprietary tools, which are of great interest to business intelligence, since they are repositories basis for creating models or patterns (behavior of customers, suppliers, products, social networks and genomics) and facilitate corporate decision making and research. The following paper present a structured methodology, simple, inspired from the agile development models as Scrum, XP and AUP. Also the models object relational, spatial data models, and the base line of data modeling under UML and Big data, from this way sought to deliver an agile methodology for the developing of data warehouses, simple and of easy application. The methodology naturally take into account the application of process for the respectively information analysis, visualization and data mining, particularly for patterns generation and derived models from the objects facts structured.

Keywords: Data warehouse, model data, big data, object fact, object relational fact, process developed data warehouse.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1473
79 Image Classification and Accuracy Assessment Using the Confusion Matrix, Contingency Matrix, and Kappa Coefficient

Authors: F. F. Howard, C. B. Boye, I. Yakubu, J. S. Y. Kuma

Abstract:

One of the ways that could be used for the production of land use and land cover maps by a procedure known as image classification is the use of the remote sensing technique. Numerous elements ought to be taken into consideration, including the availability of highly satisfactory Landsat imagery, secondary data and a precise classification process. The goal of this study was to classify and map the land use and land cover of the study area using remote sensing and Geospatial Information System (GIS) analysis. The classification was done using Landsat 8 satellite images acquired in December 2020 covering the study area. The Landsat image was downloaded from the USGS. The Landsat image with 30 m resolution was geo-referenced to the WGS_84 datum and Universal Transverse Mercator (UTM) Zone 30N coordinate projection system. A radiometric correction was applied to the image to reduce the noise in the image. This study consists of two sections: the Land Use/Land Cover (LULC) and Accuracy Assessments using the confusion and contingency matrix and the Kappa coefficient. The LULC classifications were vegetation (agriculture) (67.87%), water bodies (0.01%), mining areas (5.24%), forest (26.02%), and settlement (0.88%). The overall accuracy of 97.87% and the kappa coefficient (K) of 97.3% were obtained for the confusion matrix. While an overall accuracy of 95.7% and a Kappa coefficient of 0.947 were obtained for the contingency matrix, the kappa coefficients were rated as substantial; hence, the classified image is fit for further research.

Keywords: Confusion Matrix, contingency matrix, kappa coefficient, land used/ land cover, accuracy assessment.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 230
78 Fuzzy Wavelet Packet based Feature Extraction Method for Multifunction Myoelectric Control

Authors: Rami N. Khushaba, Adel Al-Jumaily

Abstract:

The myoelectric signal (MES) is one of the Biosignals utilized in helping humans to control equipments. Recent approaches in MES classification to control prosthetic devices employing pattern recognition techniques revealed two problems, first, the classification performance of the system starts degrading when the number of motion classes to be classified increases, second, in order to solve the first problem, additional complicated methods were utilized which increase the computational cost of a multifunction myoelectric control system. In an effort to solve these problems and to achieve a feasible design for real time implementation with high overall accuracy, this paper presents a new method for feature extraction in MES recognition systems. The method works by extracting features using Wavelet Packet Transform (WPT) applied on the MES from multiple channels, and then employs Fuzzy c-means (FCM) algorithm to generate a measure that judges on features suitability for classification. Finally, Principle Component Analysis (PCA) is utilized to reduce the size of the data before computing the classification accuracy with a multilayer perceptron neural network. The proposed system produces powerful classification results (99% accuracy) by using only a small portion of the original feature set.

Keywords: Biomedical Signal Processing, Data mining andInformation Extraction, Machine Learning, Rehabilitation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1733
77 Fuzzy Relatives of the CLARANS Algorithm With Application to Text Clustering

Authors: Mohamed A. Mahfouz, M. A. Ismail

Abstract:

This paper introduces new algorithms (Fuzzy relative of the CLARANS algorithm FCLARANS and Fuzzy c Medoids based on randomized search FCMRANS) for fuzzy clustering of relational data. Unlike existing fuzzy c-medoids algorithm (FCMdd) in which the within cluster dissimilarity of each cluster is minimized in each iteration by recomputing new medoids given current memberships, FCLARANS minimizes the same objective function minimized by FCMdd by changing current medoids in such away that that the sum of the within cluster dissimilarities is minimized. Computing new medoids may be effected by noise because outliers may join the computation of medoids while the choice of medoids in FCLARANS is dictated by the location of a predominant fraction of points inside a cluster and, therefore, it is less sensitive to the presence of outliers. In FCMRANS the step of computing new medoids in FCMdd is modified to be based on randomized search. Furthermore, a new initialization procedure is developed that add randomness to the initialization procedure used with FCMdd. Both FCLARANS and FCMRANS are compared with the robust and linearized version of fuzzy c-medoids (RFCMdd). Experimental results with different samples of the Reuter-21578, Newsgroups (20NG) and generated datasets with noise show that FCLARANS is more robust than both RFCMdd and FCMRANS. Finally, both FCMRANS and FCLARANS are more efficient and their outputs are almost the same as that of RFCMdd in terms of classification rate.

Keywords: Data Mining, Fuzzy Clustering, Relational Clustering, Medoid-Based Clustering, Cluster Analysis, Unsupervised Learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2394
76 Learning Classifier Systems Approach for Automated Discovery of Crisp and Fuzzy Hierarchical Production Rules

Authors: Suraiya Jabin, Kamal K. Bharadwaj

Abstract:

This research presents a system for post processing of data that takes mined flat rules as input and discovers crisp as well as fuzzy hierarchical structures using Learning Classifier System approach. Learning Classifier System (LCS) is basically a machine learning technique that combines evolutionary computing, reinforcement learning, supervised or unsupervised learning and heuristics to produce adaptive systems. A LCS learns by interacting with an environment from which it receives feedback in the form of numerical reward. Learning is achieved by trying to maximize the amount of reward received. Crisp description for a concept usually cannot represent human knowledge completely and practically. In the proposed Learning Classifier System initial population is constructed as a random collection of HPR–trees (related production rules) and crisp / fuzzy hierarchies are evolved. A fuzzy subsumption relation is suggested for the proposed system and based on Subsumption Matrix (SM), a suitable fitness function is proposed. Suitable genetic operators are proposed for the chosen chromosome representation method. For implementing reinforcement a suitable reward and punishment scheme is also proposed. Experimental results are presented to demonstrate the performance of the proposed system.

Keywords: Hierarchical Production Rule, Data Mining, Learning Classifier System, Fuzzy Subsumption Relation, Subsumption matrix, Reinforcement Learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1450
75 Visual Text Analytics Technologies for Real-Time Big Data: Chronological Evolution and Issues

Authors: Siti Azrina B. A. Aziz, Siti Hafizah A. Hamid

Abstract:

New approaches to analyze and visualize data stream in real-time basis is important in making a prompt decision by the decision maker. Financial market trading and surveillance, large-scale emergency response and crowd control are some example scenarios that require real-time analytic and data visualization. This situation has led to the development of techniques and tools that support humans in analyzing the source data. With the emergence of Big Data and social media, new techniques and tools are required in order to process the streaming data. Today, ranges of tools which implement some of these functionalities are available. In this paper, we present chronological evolution evaluation of technologies for supporting of real-time analytic and visualization of the data stream. Based on the past research papers published from 2002 to 2014, we gathered the general information, main techniques, challenges and open issues. The techniques for streaming text visualization are identified based on Text Visualization Browser in chronological order. This paper aims to review the evolution of streaming text visualization techniques and tools, as well as to discuss the problems and challenges for each of identified tools.

Keywords: Information visualization, visual analytics, text mining, visual text analytics tools, big data visualization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 999
74 Intelligent Recognition of Diabetes Disease via FCM Based Attribute Weighting

Authors: Kemal Polat

Abstract:

In this paper, an attribute weighting method called fuzzy C-means clustering based attribute weighting (FCMAW) for classification of Diabetes disease dataset has been used. The aims of this study are to reduce the variance within attributes of diabetes dataset and to improve the classification accuracy of classifier algorithm transforming from non-linear separable datasets to linearly separable datasets. Pima Indians Diabetes dataset has two classes including normal subjects (500 instances) and diabetes subjects (268 instances). Fuzzy C-means clustering is an improved version of K-means clustering method and is one of most used clustering methods in data mining and machine learning applications. In this study, as the first stage, fuzzy C-means clustering process has been used for finding the centers of attributes in Pima Indians diabetes dataset and then weighted the dataset according to the ratios of the means of attributes to centers of theirs. Secondly, after weighting process, the classifier algorithms including support vector machine (SVM) and k-NN (k- nearest neighbor) classifiers have been used for classifying weighted Pima Indians diabetes dataset. Experimental results show that the proposed attribute weighting method (FCMAW) has obtained very promising results in the classification of Pima Indians diabetes dataset.

Keywords: Fuzzy C-means clustering, Fuzzy C-means clustering based attribute weighting, Pima Indians diabetes dataset, SVM.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1757
73 A Fuzzy-Rough Feature Selection Based on Binary Shuffled Frog Leaping Algorithm

Authors: Javad Rahimipour Anaraki, Saeed Samet, Mahdi Eftekhari, Chang Wook Ahn

Abstract:

Feature selection and attribute reduction are crucial problems, and widely used techniques in the field of machine learning, data mining and pattern recognition to overcome the well-known phenomenon of the Curse of Dimensionality. This paper presents a feature selection method that efficiently carries out attribute reduction, thereby selecting the most informative features of a dataset. It consists of two components: 1) a measure for feature subset evaluation, and 2) a search strategy. For the evaluation measure, we have employed the fuzzy-rough dependency degree (FRFDD) of the lower approximation-based fuzzy-rough feature selection (L-FRFS) due to its effectiveness in feature selection. As for the search strategy, a modified version of a binary shuffled frog leaping algorithm is proposed (B-SFLA). The proposed feature selection method is obtained by hybridizing the B-SFLA with the FRDD. Nine classifiers have been employed to compare the proposed approach with several existing methods over twenty two datasets, including nine high dimensional and large ones, from the UCI repository. The experimental results demonstrate that the B-SFLA approach significantly outperforms other metaheuristic methods in terms of the number of selected features and the classification accuracy.

Keywords: Binary shuffled frog leaping algorithm, feature selection, fuzzy-rough set, minimal reduct.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 723
72 Relevance Feedback within CBIR Systems

Authors: Mawloud Mosbah, Bachir Boucheham

Abstract:

We present here the results for a comparative study of some techniques, available in the literature, related to the relevance feedback mechanism in the case of a short-term learning. Only one method among those considered here is belonging to the data mining field which is the K-nearest neighbors algorithm (KNN) while the rest of the methods is related purely to the information retrieval field and they fall under the purview of the following three major axes: Shifting query, Feature Weighting and the optimization of the parameters of similarity metric. As a contribution, and in addition to the comparative purpose, we propose a new version of the KNN algorithm referred to as an incremental KNN which is distinct from the original version in the sense that besides the influence of the seeds, the rate of the actual target image is influenced also by the images already rated. The results presented here have been obtained after experiments conducted on the Wang database for one iteration and utilizing color moments on the RGB space. This compact descriptor, Color Moments, is adequate for the efficiency purposes needed in the case of interactive systems. The results obtained allow us to claim that the proposed algorithm proves good results; it even outperforms a wide range of techniques available in the literature.

Keywords: CBIR, Category Search, Relevance Feedback (RFB), Query Point Movement, Standard Rocchio’s Formula, Adaptive Shifting Query, Feature Weighting, Optimization of the Parameters of Similarity Metric, Original KNN, Incremental KNN.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2338
71 ParkedGuard: An Efficient and Accurate Parked Domain Detection System Using Graphical Locality Analysis and Coarse-To-Fine Strategy

Authors: Chia-Min Lai, Wan-Ching Lin, Hahn-Ming Lee, Ching-Hao Mao

Abstract:

As world wild internet has non-stop developments, making profit by lending registered domain names emerges as a new business in recent years. Unfortunately, the larger the market scale of domain lending service becomes, the riskier that there exist malicious behaviors or malwares hiding behind parked domains will be. Also, previous work for differentiating parked domain suffers two main defects: 1) too much data-collecting effort and CPU latency needed for features engineering and 2) ineffectiveness when detecting parked domains containing external links that are usually abused by hackers, e.g., drive-by download attack. Aiming for alleviating above defects without sacrificing practical usability, this paper proposes ParkedGuard as an efficient and accurate parked domain detector. Several scripting behavioral features were analyzed, while those with special statistical significance are adopted in ParkedGuard to make feature engineering much more cost-efficient. On the other hand, finding memberships between external links and parked domains was modeled as a graph mining problem, and a coarse-to-fine strategy was elaborately designed by leverage the graphical locality such that ParkedGuard outperforms the state-of-the-art in terms of both recall and precision rates.

Keywords: Coarse-to-fine strategy, domain parking service, graphical locality analysis, parked domain.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1243
70 Efficiency in Urban Governance towards Sustainability and Competitiveness of City : A Case Study of Kuala Lumpur

Authors: Hamzah Jusoh, Azmizam Abdul Rashid

Abstract:

Malaysia has successfully applied economic planning to guide the development of the country from an economy of agriculture and mining to a largely industrialised one. Now, with its sights set on attaining the economic level of a fully developed nation by 2020, the planning system must be made even more efficient and focused. It must ensure that every investment made in the country, contribute towards creating the desirable objective of a strong, modern, internationally competitive, technologically advanced, post-industrial economy. Cities in Malaysia must also be fully aware of the enormous competition it faces in a region with rapidly expanding and modernising economies, all contending for the same pool of potential international investments. Efficiency of urban governance is also fundamental issue in development characterized by sustainability, subsidiarity, equity, transparency and accountability, civic engagement and citizenship, and security. As described above, city competitiveness is harnessed through 'city marketing and city management'. High technology and high skilled industries, together with finance, transportation, tourism, business, information and professional services shopping and other commercial activities, are the principal components of the nation-s economy, which must be developed to a level well beyond where it is now. In this respect, Kuala Lumpur being the premier city must play the leading role.

Keywords: Economic planning, sustainability, efficiency, urban governance and city competitiveness.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2231
69 Spreading Japan's National Image through China during the Era of Mass Tourism: The Japan National Tourism Organization’s Use of Sina Weibo

Authors: Abigail Qian Zhou

Abstract:

Since China has entered an era of mass tourism, there has been a fundamental change in the way Chinese people approach and perceive the image of other countries. With the advent of the new media era, social networking sites such as Sina Weibo have become a tool for many foreign governmental organizations to spread and promote their national image. Among them, the Japan National Tourism Organization (JNTO) was one of the first foreign official tourism agencies to register with Sina Weibo and actively implement communication activities. Due to historical and political reasons, cognition of Japan's national image by the Chinese has always been complicated and contradictory. However, since 2015, China has become the largest source of tourists visiting Japan. This clearly indicates that the broadening of Japan's national image in China has been effective and has value worthy of reference in promoting a positive Chinese perception of Japan and encouraging Japanese tourism. Within this context and using the method of content analysis in media studies through content mining software, this study analyzed how JNTO’s Sina Weibo accounts have constructed and spread Japan's national image. This study also summarized the characteristics of its content and form, and finally revealed the strategy of JNTO in building its international image. The findings of this study not only add a tourism-based perspective to traditional national image communications research, but also provide some reference for the effective international dissemination of national image in the future.

Keywords: National image, tourism, international communication, Japan, China.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 973
68 Object Identification with Color, Texture, and Object-Correlation in CBIR System

Authors: Awais Adnan, Muhammad Nawaz, Sajid Anwar, Tamleek Ali, Muhammad Ali

Abstract:

Needs of an efficient information retrieval in recent years in increased more then ever because of the frequent use of digital information in our life. We see a lot of work in the area of textual information but in multimedia information, we cannot find much progress. In text based information, new technology of data mining and data marts are now in working that were started from the basic concept of database some where in 1960. In image search and especially in image identification, computerized system at very initial stages. Even in the area of image search we cannot see much progress as in the case of text based search techniques. One main reason for this is the wide spread roots of image search where many area like artificial intelligence, statistics, image processing, pattern recognition play their role. Even human psychology and perception and cultural diversity also have their share for the design of a good and efficient image recognition and retrieval system. A new object based search technique is presented in this paper where object in the image are identified on the basis of their geometrical shapes and other features like color and texture where object-co-relation augments this search process. To be more focused on objects identification, simple images are selected for the work to reduce the role of segmentation in overall process however same technique can also be applied for other images.

Keywords: Object correlation, Geometrical shape, Color, texture, features, contents.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2021