Search results for: data mining applications and discovery
30300 A Hybrid Approach for Thread Recommendation in MOOC Forums
Authors: Ahmad. A. Kardan, Amir Narimani, Foozhan Ataiefard
Abstract:
Recommender systems have been developed to provide content and services that match users' behaviors and interests. Because of information overload in online discussion forums and users' diverse interests, recommending relevant topics and threads is considered helpful for improving the ease of forum usage. Recommendations are even more needed in educational forums, where learners must be led to relevant information. We present a hybrid thread recommender system for MOOC forums that applies social network analysis and association rule mining techniques. Initial results indicate that the proposed recommender system performs comparatively well given the limited data available from users' previous posts in the forum.
Keywords: association rule mining, hybrid recommender system, massive open online courses, MOOCs, social network analysis
Procedia PDF Downloads 296
30299 Detection of Important Biological Elements in Drug-Drug Interaction Occurrence
Authors: Reza Ferdousi, Reza Safdari, Yadollah Omidi
Abstract:
Drug-drug interactions (DDIs) are a main cause of adverse drug reactions, and the functional and molecular complexity of drug behavior in the human body makes them hard to prevent and treat. With the aid of new technologies derived from mathematical and computational science, DDI problems can be addressed with minimal cost and effort. Market basket analysis is known as a powerful method for identifying the co-occurrence of items in order to discover patterns and the frequency of elements. In this research, we used market basket analysis to identify important bio-elements in DDI occurrence. For this, we collected all known DDIs from DrugBank. The obtained data were analyzed by the market basket analysis method. We investigated all drug-enzyme, drug-carrier, drug-transporter and drug-target associations. To determine the importance of the extracted bio-elements, the extracted rules were evaluated in terms of confidence and support. Market basket analysis of the more than 45,000 known DDIs revealed more than 300 important rules that can be used to identify DDIs; members of the CYP 450 family were the most frequently shared bio-elements. We applied the extracted rules to over 2,000,000 unknown drug pairs, which led to the discovery of more than 200,000 potential DDIs. Analysis of the underlying reasons behind the DDI phenomenon can help to predict and prevent DDI occurrence. Ranking the extracted rules by their strangeness can be a supportive tool for predicting the outcome of an unknown DDI.
Keywords: drug-drug interaction, market basket analysis, rule discovery, important bio-elements
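The abstract above evaluates rules by support and confidence but does not show how they are computed. The following is a minimal sketch of those two measures for pairwise rules, not the authors' pipeline. The function name `pair_rules`, the thresholds, and the toy "baskets" of bio-element labels are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support=0.5, min_confidence=0.8):
    """Return (lhs, rhs, support, confidence) for pairwise rules lhs -> rhs.

    support    = fraction of transactions containing both items
    confidence = count(lhs and rhs) / count(lhs)
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for t in transactions:
        items = sorted(set(t))  # de-duplicate and fix the pair ordering
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair yields two candidate rules, one per direction.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules


# Hypothetical baskets: each lists bio-elements shared by one interacting pair.
baskets = [
    ["enzyme_A", "transporter_B"],
    ["enzyme_A", "transporter_B"],
    ["enzyme_A"],
    ["target_C"],
]
rules = pair_rules(baskets)
```

In this toy input only the rule `transporter_B -> enzyme_A` passes both thresholds, because every basket containing `transporter_B` also contains `enzyme_A`, while the reverse direction holds in only two of three baskets.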
Procedia PDF Downloads 314
30298 Syndromic Surveillance Framework Using Tweets Data Analytics
Authors: David Ming Liu, Benjamin Hirsch, Bashir Aden
Abstract:
Syndromic surveillance aims to detect or predict disease outbreaks through the analysis of medical data sources. Using social media data such as tweets for syndromic surveillance is becoming increasingly popular, aided by open platforms for collecting data and by the advantages of microblogging text and mobile geographic location features. In this paper, a syndromic surveillance framework with a machine learning kernel based on tweet data analytics is presented. Influenza and three cities of the United Arab Emirates (Abu Dhabi, Al Ain and Dubai) are used as the test disease and trial areas. Hospital case data provided by the Health Authority of Abu Dhabi (HAAD) are used for correlation purposes. In our model, a Latent Dirichlet allocation (LDA) engine is adapted to perform supervised learning classification, and N-fold cross-validation confusion matrices are given as the simulation results, with an overall system recall of 85.595% achieved.
Keywords: Syndromic surveillance, Tweets, Machine Learning, data mining, Latent Dirichlet allocation (LDA), Influenza
Procedia PDF Downloads 116
30297 Usability Issues of Smart Phone Applications: For Visually Challenged People
Authors: Anam Ashraf, Arif Raza
Abstract:
In this era of globalization, adopting technology is considerably more difficult for people with physical disabilities than for people without them. Advances in mobile-based accessible applications have opened up several avenues for visually challenged people across the globe. Smartphone applications are not very common among blind people, but they do access and use these applications in their daily lives to some extent. Several smartphone applications have a number of usability issues for the visually impaired. In this paper, we evaluate the usability of various Android and iPhone applications for blind people through analysis and surveys. This paper aims to provide guidance for increasing smartphone application accessibility for the visually impaired. An abstract application design is also proposed to overcome usability issues in smartphone applications for visually challenged people.
Keywords: eyes-free shell, human computer interaction, usability engineering, visually challenged
Procedia PDF Downloads 365
30296 Geospatial Network Analysis Using Particle Swarm Optimization
Authors: Varun Singh, Mainak Bandyopadhyay, Maharana Pratap Singh
Abstract:
The shortest path (SP) problem is concerned with finding the shortest path from a specified origin to a specified destination in a given network while minimizing the total cost associated with the path. This problem has widespread applications. Important applications of the SP problem include vehicle routing in transportation systems, particularly in-vehicle Route Guidance Systems (RGS), and the traffic assignment problem in transportation planning. Evolutionary methods such as Genetic Algorithms (GA), Ant Colony Optimization and Particle Swarm Optimization (PSO) have been applied to complex optimization problems to overcome the shortcomings of existing shortest path analysis methods. Various researchers have reported that PSO performs better than other evolutionary optimization algorithms in terms of success rate and solution quality. Further, Geographic Information Systems (GIS) have emerged as key information systems for geospatial data analysis and visualization. This research paper focuses on the application of PSO to the shortest path problem between multiple points of interest (POI), based on spatial data of Allahabad City and traffic speed data collected using GPS. Geovisualization of the analysis results is carried out in GIS.
Keywords: particle swarm optimization, GIS, traffic data, outliers
Procedia PDF Downloads 484
30295 A Review of Machine Learning for Big Data
Authors: Devatha Kalyan Kumar, Aravindraj D., Sadathulla A.
Abstract:
Big data are now rapidly expanding in all engineering, science and many other domains. The potential of large or massive data is undoubtedly significant, but making sense of it requires new ways of thinking and new learning techniques to address the various big data challenges. Machine learning is continuously unleashing its power in a wide range of applications. This paper reviews the latest advances in research on machine learning for big data processing. First, it covers the machine learning techniques used in recent studies, such as deep learning, representation learning, transfer learning, active learning, and distributed and parallel learning. It then focuses on the challenges and possible solutions of machine learning for big data.
Keywords: active learning, big data, deep learning, machine learning
Procedia PDF Downloads 446
30294 PDDA: Priority-Based, Dynamic Data Aggregation Approach for Sensor-Based Big Data Framework
Authors: Lutful Karim, Mohammed S. Al-kahtani
Abstract:
Sensors are being used in various applications such as agriculture, health monitoring, air and water pollution monitoring, and traffic monitoring and control, and hence play a vital role in the growth of big data. However, sensors collect redundant data. Thus, aggregating and filtering sensor data is highly important for designing an efficient big data framework. Current research does not focus on aggregating and filtering data at multiple layers of a sensor-based big data framework. Thus, this paper introduces (i) a three-layer data aggregation framework for big data and (ii) a priority-based, dynamic data aggregation scheme (PDDA) for the lowest layer, at the sensors. Simulation results show that PDDA outperforms existing tree-based and cluster-based data aggregation schemes in terms of overall network energy consumption and end-to-end data transmission delay.
Keywords: big data, clustering, tree topology, data aggregation, sensor networks
Procedia PDF Downloads 347
30293 A Review on Big Data Movement with Different Approaches
Authors: Nay Myo Sandar
Abstract:
With the growth of technologies and applications, a large amount of data is being produced at an increasing rate from various sources such as social media networks, sensor devices, and other information-serving devices. This large, complex and exponentially growing collection of datasets is called big data. Traditional database systems cannot store and process such data because of its size and complexity. Consequently, cloud computing is a potential solution for data storage and processing, since it can provide a pool of resources for servers and storage. However, moving large amounts of data to and from the cloud is challenging, since it can encounter high latency due to the large data size. With respect to the big data movement problem, this paper reviews the literature, discusses research issues, and identifies approaches for dealing with the problem.
Keywords: Big Data, Cloud Computing, Big Data Movement, Network Techniques
Procedia PDF Downloads 88
30292 Mining in Peru and Local Governance: Assessing the Contribution of CRS Projects
Authors: Sandra Carrillo Hoyos
Abstract:
Mining activities in South America have grown significantly in recent decades, given the abundance of natural resources, governmental policies implemented to incentivize foreign investment, and the boom in international prices for metals and oil between 2002 and 2008. While this context allowed the region to take a leading position among the world's top mineral producers, it has also meant an increase in socio-environmental conflicts, which have generated costs and negative impacts not only for the companies but especially for governments and local communities. During the last decade, the mining sector in Peru has faced social resistance from a large number of communities, which began organizing actions against the implementation of high-investment projects. This dissatisfaction has resulted in a prevalence of socio-environmental conflicts associated with mining activities, some of which were never resolved through an agreement. In order to prevent those socio-environmental conflicts and obtain a social license from local communities, most mining companies have developed diverse initiatives within the framework of corporate social responsibility (CSR) policies and practices. This paper assesses the mining sector's contribution to local development management over the last decade, as part of CSR strategies as well as the policies promoted by the Peruvian State. The assessment found that, in the beginning, these initiatives were based on a philanthropic approach and reacted to pressure from local stakeholders in order to maintain the consent to operate from the surrounding communities and, as a result, to create a harmonious atmosphere for operations.
Due to the weak State presence, such practices have increased communities' expectations regarding the participation of mining companies in solving structural development problems, especially those related to primary needs, infrastructure, education and health, among others. In other words, this paper focuses on analyzing to what extent these initiatives have promoted local empowerment for development planning and integrated management of natural resources from a territorial approach. From this perspective, the analysis demonstrates that, while the design and planning of social investment initiatives have improved due to the sector's sustainability approach, many companies have taken actions beyond their competence during this process. In some cases, these actions have generated dependency in communities, even though this relationship has not spared the companies from conflict situations with unfortunate consequences. Furthermore, the social programs developed have not necessarily had a significant impact on improving the quality of life of affected populations. In fact, regions with abundant mining resources and high investment are facing poverty and high dependency on mining production. In spite of the revenues derived from the mining industry, local governments have not been able to translate royalties into sustainable development opportunities. For this reason, the paper suggests some challenges for the mining sector's contribution to local development, based on the best practices and lessons learnt from a benchmarking of the leading mining companies.
Keywords: corporate social responsibility, local development, mining, socio-environmental conflict
Procedia PDF Downloads 408
30291 Exploring the Applications of Modular Forms in Cryptography
Authors: Berhane Tewelday Weldhiwot
Abstract:
This research investigates the pivotal role of modular forms in modern cryptographic systems, particularly focusing on their applications in secure communications and data integrity. Modular forms, which are complex analytic functions with rich arithmetic properties, have gained prominence due to their connections to number theory and algebraic geometry. This study begins by outlining the fundamental concepts of modular forms and their historical development, followed by a detailed examination of their applications in cryptographic protocols such as elliptic curve cryptography and zero-knowledge proofs. By employing techniques from analytic number theory, the research delves into how modular forms can enhance the efficiency and security of cryptographic algorithms. The findings suggest that leveraging modular forms not only improves computational performance but also fortifies security measures against emerging threats in digital communication. This work aims to contribute to the ongoing discourse on integrating advanced mathematical theories into practical applications, ultimately fostering innovation in cryptographic methodologies.
Keywords: modular forms, cryptography, elliptic curves, applications, mathematical theory
Procedia PDF Downloads 23
30290 Expanding Trading Strategies By Studying Sentiment Correlation With Data Mining Techniques
Authors: Ved Kulkarni, Karthik Kini
Abstract:
This experiment aims to understand how the media affects the power markets in the mainland United States and to study the reaction time between news updates and actual price movements. We have taken into account electric utility companies trading on the NYSE and excluded companies that are more politically involved and move with higher sensitivity to politics. The scraper checks for any news related to keywords, which are predefined and stored for each specific company. Based on this, the classifier allocates the effect to one of five categories: positive, negative, highly optimistic, highly negative, or neutral. The effect on the respective price movement will be studied to understand the response time. Based on the response time observed, neural networks will be trained to understand and react to changing market conditions, with the aim of achieving the best strategy in every market. The stock trader will day trade in the first phase and make option strategy predictions based on the Black-Scholes model. The expected result is an AI-based system that adjusts trading strategies within the market response time to each price movement.
Keywords: data mining, language processing, artificial neural networks, sentiment analysis
Procedia PDF Downloads 20
30289 An Enhanced Connectivity Aware Routing Protocol for Vehicular Ad Hoc Networks
Authors: Ahmadu Maidorawa, Kamalrulnizam Abu Bakar
Abstract:
This paper proposes an Enhanced Connectivity Aware Routing (ECAR) protocol for Vehicular Ad hoc Networks (VANETs). The protocol uses a control broadcast to reduce the number of overhead packets needed in the route discovery process. It is also equipped with an alternative backup route that is used whenever the primary path to the destination fails, which greatly reduces the frequent launching and re-launching of the route discovery process that wastes useful bandwidth and unnecessarily prolongs the average packet delay. NS2 simulation results show that the ECAR protocol outperformed the original connectivity aware routing (CAR) protocol, reducing the average packet delay by 28% and control overheads by 27% and increasing the packet delivery ratio by 22%.
Keywords: alternative path, primary path, protocol, routing, VANET, vehicular ad hoc networks
Procedia PDF Downloads 404
30288 Investigation of Yard Seam Workings for the Proposed Newcastle Light Rail Project
Authors: David L. Knott, Robert Kingsland, Alistair Hitchon
Abstract:
The proposed Newcastle Light Rail is a key part of the revitalisation of Newcastle, NSW and will provide a frequent and reliable travel option throughout the city centre, running from Newcastle Interchange at Wickham to Pacific Park in Newcastle East, a total of 2.7 kilometers in length. Approximately one-third of the route, along Hunter and Scott Streets, is subject to potential shallow underground mine workings. The extent of mining and seams mined is unclear. Convicts mined the Yard Seam and overlying Dudley (Dirty) Seam in Newcastle sometime between 1800 and 1830. The Australian Agricultural Company mined the Yard Seam from about 1831 to the 1860s in the alignment area. The Yard Seam was about 3 feet (0.9m) thick, and therefore, known as the Yard Seam. Mine maps do not exist for the workings in the area of interest and it was unclear if both or just one seam was mined. Information from 1830s geological mapping and other data showing shaft locations were used along Scott Street and information from the 1908 Royal Commission was used along Hunter Street to develop an investigation program. In addition, mining was encountered for several sites to the south of the alignment at depths of about 7 m to 25 m. Based on the anticipated depths of mining, it was considered prudent to assess the potential for sinkhole development on the proposed alignment and realigned underground utilities and to obtain approval for the work from Subsidence Advisory NSW (SA NSW). The assessment consisted of a desktop study, followed by a subsurface investigation. Four boreholes were drilled along Scott Street and three boreholes were drilled along Hunter Street using HQ coring techniques in the rock. The placement of boreholes was complicated by the presence of utilities in the roadway and traffic constraints. All the boreholes encountered the Yard Seam, with conditions varying from unmined coal to an open void, indicating the presence of mining. 
The geotechnical information obtained from the boreholes was expanded by using various downhole techniques, including borehole camera, borehole sonar, and downhole geophysical logging. The camera provided views of the rock and helped to explain zones of no recovery. In addition, timber props within the void were observed. Borehole sonar was performed in the void and provided an indication of room size as well as the presence of timber props within the room. Downhole geophysical logging was performed in the boreholes to measure density, natural gamma, and borehole deviation. The data helped confirm that all the mining was in the Yard Seam and that the overlying Dudley Seam had been eroded in the past over much of the alignment. In summary, the assessment allowed the potential for sinkhole subsidence to be assessed and a mitigation approach to be developed to allow conditional approval by SA NSW. It also confirmed the presence of mining in the Yard Seam, the depth to the seam and mining conditions, and indicated that subsidence did not appear to have occurred in the past.
Keywords: downhole investigation techniques, drilling, mine subsidence, yard seam
Procedia PDF Downloads 314
30287 The Prospects of Leveraging (Big) Data for Accelerating a Just Sustainable Transition around Different Contexts
Authors: Sombol Mokhles
Abstract:
This paper explores the prospects of utilising (big) data to enable the just transition of diverse cities. Our key purpose is to offer a framework of the applications and implications of utilising (big) data in comparing sustainability transitions across different cities. Relying on cosmopolitan comparison, this paper explains the potential applications of (big) data as well as its limitations. The paper calls for adopting a data-driven and just perspective when including different cities around the world. Placing a just and inclusive approach front and centre ensures a just transition with synergistic effects that leaves nobody behind.
Keywords: big data, just sustainable transition, cosmopolitan city comparison, cities
Procedia PDF Downloads 99
30286 Lead and Cadmium Spatial Pattern and Risk Assessment around Coal Mine in Hyrcanian Forest, North Iran
Authors: Mahsa Tavakoli, Seyed Mohammad Hojjati, Yahya Kooch
Abstract:
In this study, the effect of coal mining activities on lead and cadmium concentrations and distribution in soil was investigated in the Hyrcanian forest, North Iran. Sixteen plots (20×20 m2) were established by systematic random sampling (60×60 m2) in an area of 4 ha (200×200 m2, with the mine entrance at the center). An area adjacent to the mine that was not affected by the mining activity was taken as the control area. To investigate soil lead and cadmium concentrations, one sample was taken from the 0-10 cm layer in each plot. To study the spatial pattern of soil properties and of lead and cadmium concentrations in the mining area, an area of 80×80 m2 (with the mine at the center) was considered and 80 soil samples were taken by systematic random sampling (10 m intervals). Geostatistical analysis was performed using the kriging method and GS+ software (version 5.1). To estimate the impact of coal mining activities on soil quality, pollution indices were calculated. Lead and cadmium concentrations were significantly higher in the mine area (Pb: 10.97±0.30, Cd: 184.47±6.26 mg.kg-1) than in the control area (Pb: 9.42±0.17, Cd: 131.71±15.77 mg.kg-1). The mean values of the PI index indicate slight pollution for both Pb (1.16) and Cd (1.77). The NIPI index indicated slight pollution for Pb (1.44) and moderate pollution for Cd (2.52). Results of variography and kriging showed that it is possible to prepare interpolation maps of lead and cadmium around the mining areas in the Hyrcanian forest. According to the results of the pollution and risk assessments, the forest soil was contaminated by heavy metals (lead and cadmium); therefore, reclamation and remediation techniques are necessary in these areas.
Keywords: traditional coal mining, heavy metals, pollution indicators, geostatistics, Caspian forest
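The abstract reports PI and NIPI values without stating their formulas. The sketch below assumes the definitions commonly used for these indices: a single pollution index equal to the measured concentration divided by a reference concentration, and a Nemerow integrated index combining the mean and the maximum of the single indices. Whether the authors used exactly these definitions, and which reference values they chose, is not stated in the source; the function names and the sample numbers are illustrative only.

```python
import math


def pollution_index(concentration, reference):
    """Single pollution index: measured concentration over a reference value.

    Assumption: the reference is a background or control concentration.
    """
    if reference <= 0:
        raise ValueError("reference concentration must be positive")
    return concentration / reference


def nemerow_index(pi_values):
    """Nemerow integrated pollution index over a list of single PI values.

    Assumed form: sqrt((mean(PI)^2 + max(PI)^2) / 2).
    """
    if not pi_values:
        raise ValueError("need at least one PI value")
    mean_pi = sum(pi_values) / len(pi_values)
    max_pi = max(pi_values)
    return math.sqrt((mean_pi ** 2 + max_pi ** 2) / 2)


# Illustrative per-plot PI values (hypothetical, not taken from the study).
plot_pis = [1.0, 2.0]
nipi = nemerow_index(plot_pis)
```

Because the Nemerow form weights the maximum as heavily as the mean, a single strongly polluted plot raises the integrated index well above the average of the plots.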
Procedia PDF Downloads 180
30285 Mining User-Generated Contents to Detect Service Failures with Topic Model
Authors: Kyung Bae Park, Sung Ho Ha
Abstract:
Online user-generated content (UGC) significantly changes the way customers behave (e.g., shop, travel), and the pressing need to handle the overwhelming amount of varied UGC is one of the paramount issues for management. However, current approaches (e.g., sentiment analysis) are often ineffective at leveraging textual information to detect the problems or issues that a given management suffers from. In this paper, we apply text mining with Latent Dirichlet Allocation (LDA) to a popular online review site dedicated to user complaints. We find that the employed LDA efficiently detects customer complaints, and that further inspection with a visualization technique is effective for categorizing the problems or issues. As such, management can identify the issues at stake and prioritize them in a timely manner given limited resources. The findings provide managerial insights into how analytics on social media can help organizations maintain and improve their reputation management. Our interdisciplinary approach also highlights several insights gained by applying machine learning techniques in the marketing research domain. On a broader technical note, this paper illustrates in detail how to implement LDA in R from beginning (data collection in R) to end (LDA analysis in R), since such instruction is still largely undocumented. In this regard, it will help lower the barrier for interdisciplinary researchers to conduct related research.
Keywords: latent dirichlet allocation, R program, text mining, topic model, user generated contents, visualization
Procedia PDF Downloads 187
30284 Statistical Models and Time Series Forecasting on Crime Data in Nepal
Authors: Dila Ram Bhandari
Abstract:
Throughout the 20th century, new governments were created in which identities such as ethnic, religious, linguistic, caste, communal, tribal, and others played a part in the development of constitutions and of the legal system of victim and criminal justice. Acute issues with extremism, poverty, environmental degradation, cybercrime, human rights violations, and crimes against and victimization of both individuals and groups have recently plagued South Asian nations. Every day a massive number of crimes are committed, and these frequent crimes have made the lives of ordinary citizens restless. Crime is one of the major threats to society and to civilization, and it can create societal disturbance. Old-style crime-solving practices are unable to meet the requirements of current crime situations. Crime analysis is one of the most important activities of the majority of intelligence and law enforcement organizations all over the world. Unlike the Central Asia or Asia Pacific regions, the South Asia region lacks a regional coordination mechanism to facilitate criminal intelligence sharing and operational coordination related to organized crime, including illicit drug trafficking and money laundering. There have been numerous conversations in recent years about using data mining technology to combat crime and terrorism. The Data Detective program from the software company Sentient uses data mining techniques to support the police (Sentient, 2017). The goals of this internship are to test several predictive model solutions and choose the most effective and promising one. First, extensive literature reviews on data mining, crime analysis, and crime data mining were conducted. Sentient offered a 7-year archive of crime statistics that were aggregated daily to produce a univariate dataset. Moreover, a daily aggregation by incident type was performed to produce a multivariate dataset. Each solution's forecast period lasted seven days.
Statistical models and neural network models were the two main groups into which the experiments were split. For the crime data, neural networks fared better than statistical models. This study gives a general review of the applied statistical and neural network models. A comparative analysis of all the models on a comparable dataset provides a detailed picture of each model's performance on the available data and of its generalizability. The experiments demonstrated that, in comparison to other models, Gated Recurrent Units (GRU) produced better predictions. The crime records for 2005-2019 were collected from Nepal Police headquarters and analysed using R. In conclusion, a gated recurrent unit implementation could benefit the police in predicting crime. Hence, time series analysis using GRU could be a prospective additional feature in Data Detective.
Keywords: time series analysis, forecasting, ARIMA, machine learning
Procedia PDF Downloads 166
30283 Point Estimation for the Type II Generalized Logistic Distribution Based on Progressively Censored Data
Authors: Rana Rimawi, Ayman Baklizi
Abstract:
Skewed distributions are important models that are frequently used in applications. Generalized distributions form a class of skewed distributions and have gained widespread use in applications because of their flexibility in data analysis. More specifically, the Generalized Logistic Distribution, with its different types, has received considerable attention recently. In this study, based on progressively type-II censored data, we consider point estimation for the type II Generalized Logistic Distribution (Type II GLD). We develop several estimators for its unknown parameters, including maximum likelihood estimators (MLE), Bayes estimators and best linear unbiased estimators (BLUE). The estimators are compared by simulation using the criteria of bias and mean squared error (MSE). An illustrative example using a real data set is given.
Keywords: point estimation, type II generalized logistic distribution, progressive censoring, maximum likelihood estimation
Procedia PDF Downloads 200
30282 Research of the Three-Dimensional Visualization Geological Modeling of Mine Based on Surpac
Authors: Honggang Qu, Yong Xu, Rongmei Liu, Zhenji Gao, Bin Wang
Abstract:
Today's mining industry is gradually advancing in a digital and visual direction. Three-dimensional visualization geological modeling of a mine is the digital characterization of mineral deposits and is one of the key technologies of digital mining. Three-dimensional geological modeling is a technology that uses computers to combine geological spatial information management, geological interpretation, geological spatial analysis and prediction, geostatistical analysis, entity content analysis and graphic visualization in a three-dimensional environment, and it is used in geological analysis. In this paper, a three-dimensional geological model of an iron mine is constructed using Surpac, the difference in weights between two estimation methods, the inverse distance power method and ordinary kriging, is studied, and the ore body volume and reserves are simulated and calculated using these two methods. Compared with the actual mine reserves, the result is relatively accurate, so it provides a scientific basis for mine resource assessment, reserve calculation, mining design and so on.
Keywords: three-dimensional geological modeling, geological database, geostatistics, block model
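The inverse distance power method mentioned above weights each sample by the reciprocal of its distance to the estimation point raised to a chosen power. The sketch below illustrates that weighting only; it is not Surpac's implementation and not the study's configuration. The function name `idw_estimate`, the default power of 2, and the sample coordinates and grades are hypothetical.

```python
import math


def idw_estimate(samples, target, power=2.0):
    """Inverse distance weighted estimate at `target`.

    samples: list of ((x, y, z), value) pairs, e.g. assay grades at drillhole points.
    The weight of each sample is 1 / distance**power.
    """
    if not samples:
        raise ValueError("need at least one sample")
    weighted_sum = 0.0
    weight_total = 0.0
    for point, value in samples:
        distance = math.dist(point, target)
        if distance == 0.0:
            return value  # target coincides with a sample: use it directly
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical grades at two sample points along a line.
samples = [((0.0, 0.0, 0.0), 10.0), ((2.0, 0.0, 0.0), 20.0)]
midpoint_grade = idw_estimate(samples, (1.0, 0.0, 0.0))
near_first_grade = idw_estimate(samples, (0.5, 0.0, 0.0))
```

Unlike this fixed distance-based weighting, ordinary kriging derives its weights from a fitted variogram, which is the difference in weights that the abstract says it studies.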
Procedia PDF Downloads 80
30281 Saving Energy at a Wastewater Treatment Plant through Electrical and Production Data Analysis
Authors: Adriano Araujo Carvalho, Arturo Alatrista Corrales
Abstract:
This paper shows how analysis of electrical energy consumption and production data was used to find opportunities to save energy at the Taboada wastewater treatment plant in Callao, Peru. To access the data, independent data networks were used for both electrical and process instruments; the data were analyzed under an ISO 50001 energy audit, which considered Energy Performance Indexes for each process and a step-by-step guide presented in this text. Thanks to this methodology and to data mining techniques applied to information gathered through electronic multimeters (conveniently placed on substation switchboards connected to a cloud network), it was possible to identify thoroughly the performance of each process and thus reveal saving opportunities that were previously hidden. The data analysis reduced both costs and energy use, allowing the plant to save significant resources and to be certified under ISO 50001.
Keywords: energy and production data analysis, energy management, ISO 50001, wastewater treatment plant energy analysis
Procedia PDF Downloads 197
30280 Interoperable Platform for Internet of Things at Home Applications
Authors: Fabiano Amorim Vaz, Camila Gonzaga de Araujo
Abstract:
With the growing number of personal devices such as smartphones, tablets, and smart watches, in addition to recent devices designed for IoT, the residential environment has the potential to generate important information about our daily lives. This work therefore focuses on presenting and evaluating a system that integrates all these technologies in the context of a smart house. To achieve this, we define an architecture capable of supporting the amount of data generated and consumed at a residence and, above all, the variety of this data. We organize it in a particular cloud containing information about robots, recreational vehicles, and weather, in addition to data from the house, such as lighting, energy, and security. The proposed architecture can be extended to various scenarios and applications. Building on the core of this work, new functionality can be defined for residences, integrating them with more resources.
Keywords: cloud computing, IoT, robotics, smart house
Procedia PDF Downloads 382
30279 A Survey on Data-Centric and Data-Aware Techniques for Large Scale Infrastructures
Authors: Silvina Caíno-Lores, Jesús Carretero
Abstract:
Large scale computing infrastructures have been widely developed with the core objective of providing a suitable platform for high-performance and high-throughput computing. These systems are designed to support resource-intensive and complex applications, which can be found in many scientific and industrial areas. Currently, large scale data-intensive applications are hindered by the high latencies that result from access to vastly distributed data. Recent works have suggested that improving data locality is key to moving towards exascale infrastructures efficiently, as solutions to this problem aim to reduce the bandwidth consumed in data transfers and the overheads that arise from them. There are several techniques that attempt to move computations closer to the data. In this survey we analyse the different mechanisms that have been proposed to provide data locality for large scale high-performance and high-throughput systems. This survey intends to assist the scientific computing community in understanding the various technical aspects and strategies that have been reported in recent literature regarding data locality. As a result, we present an overview of locality-oriented techniques, which are grouped into four main categories: application development, task scheduling, in-memory computing and storage platforms. Finally, we include a discussion on future research lines and synergies among these techniques.
Keywords: data locality, data-centric computing, large scale infrastructures, cloud computing
Procedia PDF Downloads 260
30278 Extracting Opinions from Big Data of Indonesian Customer Reviews Using Hadoop MapReduce
Authors: Veronica S. Moertini, Vinsensius Kevin, Gede Karya
Abstract:
Customer reviews are collected by many kinds of e-commerce websites selling products, services, hotel rooms, tickets, and so on. Each website collects its own customer reviews. The reviews can be crawled from those websites and stored as big data. Text analysis techniques can be used to analyze that data to produce summarized information, such as customer opinions. These opinions can then be published by independent service provider websites and used to help customers choose the most suitable products or services. Because the opinions are derived from big data of reviews originating from many websites, the results are expected to be more trustworthy and accurate. Indonesian customers write reviews in the Indonesian language, which has its own structures and unique features. We found that most of the reviews are expressed in “daily language”, which is informal, does not follow correct grammar, and contains many abbreviations and slang or informal words. Hadoop is an emerging platform for storing and analyzing big data in distributed systems. A Hadoop cluster consists of master and slave nodes/computers operating in a network. Hadoop comes with a distributed file system (HDFS) and the MapReduce framework for supporting parallel computation. However, MapReduce is inefficient for iterative computations; specifically, the cost of reading/writing data (I/O cost) is high. Given this fact, we conclude that MapReduce is best suited to “one-pass” computation. In this research, we develop an efficient technique for extracting or mining opinions from big data of Indonesian reviews, which is based on MapReduce with one-pass computation. In designing the algorithm, we avoid iterative computation and instead adopt a “look up table” technique.
The stages of the proposed technique are: (1) crawling the review data from websites; (2) cleaning and finding root words from the raw reviews; (3) computing the frequency of the meaningful opinion words; (4) analyzing customers' sentiments towards defined objects. The experiments for evaluating the performance of the technique were conducted on a Hadoop cluster with 14 slave nodes. The results show that the proposed technique (stages 2 to 4) discovers useful opinions, processes big data efficiently, and is scalable.
Keywords: big data analysis, Hadoop MapReduce, analyzing text data, mining Indonesian reviews
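The one-pass, look-up-table idea described in this abstract can be sketched in plain Python as a single map step followed by a single reduce step. This is only an illustration, not the authors' Hadoop implementation: the table `OPINION_LOOKUP`, its four entries, and all function names are hypothetical, and root-word extraction (stage 2) is reduced to lowercasing and whitespace splitting.

```python
from collections import Counter

# Hypothetical look-up table: opinion word -> polarity (+1 positive, -1 negative).
OPINION_LOOKUP = {"bagus": 1, "murah": 1, "buruk": -1, "mahal": -1}


def map_review(review):
    """Map step: emit (word, 1) for every token found in the look-up table."""
    for token in review.lower().split():
        if token in OPINION_LOOKUP:
            yield token, 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted counts per opinion word."""
    counts = Counter()
    for word, n in pairs:
        counts[word] += n
    return counts


def one_pass_opinions(reviews):
    """Read every review exactly once and return (word counts, net polarity)."""
    pairs = (pair for review in reviews for pair in map_review(review))
    counts = reduce_counts(pairs)
    score = sum(OPINION_LOOKUP[word] * n for word, n in counts.items())
    return counts, score
```

Because each token is resolved by a dictionary look-up, no review has to be read twice, which matches the motivation given in the abstract for avoiding iterative MapReduce jobs.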
Procedia PDF Downloads 201
30277 Directional Dust Deposition Measurements: The Influence of Seasonal Changes and the Meteorological Conditions Influencing in Witbank Area and Carletonville Area
Authors: Maphuti Georgina Kwata
Abstract:
Coal mining in Mpumalanga Province is known to contribute to atmospheric pollution through various activities. Gold mining in North-West Province is also known to contribute to atmospheric pollution, especially through the production of radon gas. In this research, a directional dust deposition gauge was used to determine the direction of dust sources, and meteorological data were used to determine the wind rose and the influence of seasonal changes. Dust was collected for fourteen months in the Witbank Area and the Carletonville Area. The results show that the source direction for Ericson Dam was East in February 2010, and that the source direction for Tip Area was West in October 2010. To the East there were mining operations and power stations, which made the East the source direction. To the West there were smelters, power stations, and agricultural activities, which made the West the source direction for Driefontein Mine: East Recreational Village Club. For Leslie Williams hospital the source direction was East, which also indicated dust-generating activities there, such as mining operations and agricultural activities. The meteorological results for the Emalahleni Area show that in summer and winter the wind blows from the East sector at 5-10 m/s. On annual average the wind blows from the east-south-eastern sector at 20 m/s, and in the daytime the wind blows from the north-western sector at speeds in excess of 20 m/s. At night the wind direction is East-eastern, with a maximum wind speed of 20 m/s. The meteorological results for Driefontein Mine show that the wind blows from the north-western and north-eastern sectors at 5-10 m/s. In the daytime the wind blows from the West sector and at night from the North sector. In summer the wind blows from the north-east sector at 5-10 m/s, and in winter it blows from the north-west, which is also the predominant direction.
In spring the wind blows from the north-east. The conclusion is that the dust sources included not only the mining operations where the directional dust deposition gauges were installed but also the power stations, smelters, and other activities near the mining operations. It is recommended that dust suppressant be used regularly on unpaved roads and that weather conditions (wind speed and direction) be monitored prior to blasting to ensure minimal emissions.
Keywords: directional dust deposition gauge, BS part 5 1747 dust deposit gauge, wind rose, wind blowing
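A wind rose of the kind reported in this abstract is built by counting observations per compass sector and speed class. The sketch below shows one way to do that binning; the 16-sector layout, the speed class limits of 5, 10, and 20 m/s, and the names `sector_of` and `wind_rose_counts` are assumptions chosen to mirror the speeds quoted above, not the author's procedure.

```python
from collections import Counter

# 16 compass sectors of 22.5 degrees each, centred on N, NNE, NE, ...
SECTORS = ["N", "NNE", "NE", "ENE", "E", "ESE", "SE", "SSE",
           "S", "SSW", "SW", "WSW", "W", "WNW", "NW", "NNW"]


def sector_of(direction_deg):
    """Return the compass sector the wind blows FROM, given degrees from North."""
    index = int(((direction_deg % 360) + 11.25) // 22.5) % 16
    return SECTORS[index]


def wind_rose_counts(observations, speed_bins=(5.0, 10.0, 20.0)):
    """Count (sector, speed class) pairs from (direction_deg, speed_m_s) tuples."""
    counts = Counter()
    for direction_deg, speed in observations:
        label = f">{speed_bins[-1]:g}"  # default: above the highest class limit
        lower = 0.0
        for upper in speed_bins:
            if speed <= upper:
                label = f"{lower:g}-{upper:g}"
                break
            lower = upper
        counts[(sector_of(direction_deg), label)] += 1
    return counts
```

Dividing each count by the total number of observations gives the frequencies usually plotted as the petals of the rose.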
Procedia PDF Downloads 506
30276 Survey on Big Data Stream Classification by Decision Tree
Authors: Mansoureh Ghiasabadi Farahani, Samira Kalantary, Sara Taghi-Pour, Mahboubeh Shamsi
Abstract:
Nowadays, the development of computer technology and its recent applications provide access to new types of data that have not been considered by traditional data analysts. Two particularly interesting characteristics of such data sets are their huge size and streaming nature. Incremental learning techniques have been used extensively to address the data stream classification problem. This paper presents a concise survey of the obstacles and requirements in classifying data streams using decision trees. The most important issue is to maintain a balance between accuracy and efficiency: the algorithm should provide good classification performance with a reasonable response time.
Keywords: big data, data streams, classification, decision tree
Procedia PDF Downloads 522
30275 Code Embedding for Software Vulnerability Discovery Based on Semantic Information
Authors: Joseph Gear, Yue Xu, Ernest Foo, Praveen Gauravaran, Zahra Jadidi, Leonie Simpson
Abstract:
Deep learning methods are seeing increasing application to the long-standing security research goal of automatic vulnerability detection for source code. Attention, however, must still be paid to the task of producing vector representations for source code (code embeddings) as input for these deep learning models. Graphical representations of code, most predominantly Abstract Syntax Trees and Code Property Graphs, have received some use in this task of late; however, for very large graphs representing very large code snippets, learning becomes prohibitively computationally expensive. This expense may be reduced by intelligently pruning this input to only vulnerability-relevant information; however, little research in this area has been performed. Additionally, most existing work comprehends code based solely on the structure of the graph at the expense of the information contained in the nodes of the graph. This paper proposes Semantic-enhanced Code Embedding for Vulnerability Discovery (SCEVD), a deep learning model which uses semantic-based feature selection for its vulnerability classification model. It uses information from the nodes as well as the structure of the code graph in order to select features which are most indicative of the presence or absence of vulnerabilities. This model is implemented and experimentally tested using the SARD Juliet vulnerability test suite to determine its efficacy. It is able to improve on existing code graph feature selection methods, as demonstrated by its improved ability to discover vulnerabilities.
Keywords: code representation, deep learning, source code semantics, vulnerability discovery
Procedia PDF Downloads 161
30274 The Beta-Fisher Snedecor Distribution with Applications to Cancer Remission Data
Authors: K. A. Adepoju, O. I. Shittu, A. U. Chukwu
Abstract:
In this paper, a new four-parameter generalized version of the Fisher Snedecor distribution, called the Beta-F distribution, is introduced. A comprehensive account of the statistical properties of the new distribution is given. Formal expressions for the cumulative distribution function, moments, moment generating function, and maximum likelihood estimation, as well as its Fisher information, were obtained. The flexibility of this distribution, as well as its robustness, was demonstrated using cancer remission time data. The new distribution can be used in most applications where the assumption underlying the use of other lifetime distributions is violated.
Keywords: fisher-snedecor distribution, beta-f distribution, outlier, maximum likelihood method
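The abstract does not state how the Beta-F distribution is constructed. Assuming it follows the standard beta-generated family, with $F(x; d_1, d_2)$ and $f(x; d_1, d_2)$ the Fisher Snedecor distribution and density functions and $a, b > 0$ two additional shape parameters (all symbols here are assumptions, chosen to give the four parameters the abstract mentions), the distribution function and density would take the form:

```latex
G(x) = \frac{1}{B(a,b)} \int_{0}^{F(x;\,d_1,d_2)} t^{a-1} (1-t)^{b-1} \, dt ,
\qquad
g(x) = \frac{f(x;\,d_1,d_2)}{B(a,b)} \,
       F(x;\,d_1,d_2)^{a-1} \, \bigl[ 1 - F(x;\,d_1,d_2) \bigr]^{b-1} ,
```

where $B(a,b)$ is the beta function. Setting $a = b = 1$ recovers the ordinary Fisher Snedecor distribution under this construction.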
Procedia PDF Downloads 348
30273 Proteomic Evaluation of Sex Differences in the Plasma of Non-human Primates Exposed to Ionizing Radiation for Biomarker Discovery
Authors: Christina Williams, Mehari Weldemariam, Ann M. Farese, Thomas J. MacVittie, Maureen A. Kane
Abstract:
Radiation exposure results in dose-dependent and time-dependent multi-organ damage. Drug development of medical countermeasures (MCM) for radiation-induced injury occurs under the FDA Animal Rule because human efficacy studies are not ethical or feasible. The FDA Animal Rule requires the representation of both sexes and describes several uses for biomarkers in MCM drug development studies. Currently, MCMs are limited and there is no FDA-approved biomarker for any radiation injury. Sex as a variable is essential to identifying biomarkers and developing effective MCMs for acute radiation syndrome (ARS) and delayed effects of acute radiation exposure (DEARE). These studies aim to address the dearth of information on sex differences, which could not be determined by studies that included only male, single-sex cohorts. Studies have reported differences in radiosensitivity according to sex. As such, biomarker discovery for radiation-induced damage must consider sex as a variable. This study evaluated the plasma proteomic profile of rhesus macaque non-human primates after different exposures and doses, as well as at different time points after irradiation. Exposures and doses included total body irradiation between 5 and 7.5 Gy and partial body irradiation with 5% bone marrow sparing at 9, 9.5 and 10 Gy. Time points after irradiation included days 1, 3, 60, and 180, which encompassed both acute radiation syndromes and delayed effects of acute radiation exposure. Bottom-up proteomic analyses of plasma included equal numbers of males and females. In the control animals, few proteomic differences were observed between the sexes. In the irradiated animals, there were a few sex differences, with changes mostly consisting of proteins upregulated in the female animals. Multiple canonical pathways were upregulated in irradiated animals relative to the control animals when subjected to pathway analysis, but differential responses between the sexes were limited.
These data provide critical baseline differences according to sex and establish sex differences in non-human primate models relevant to drug development of MCM under the FDA Animal Rule.
Keywords: ionizing radiation, sex differences, plasma proteomics, biomarker discovery
Procedia PDF Downloads 91
30272 Efficient Subsurface Mapping: Automatic Integration of Ground Penetrating Radar with Geographic Information Systems
Authors: Rauf R. Hussein, Devon M. Ramey
Abstract:
Integrating Ground Penetrating Radar (GPR) with Geographic Information Systems (GIS) can provide valuable insights for various applications, such as archaeology, transportation, and utility locating. Although there has been progress toward automating the integration of GPR data with GIS, fully automatic integration has not yet been achieved. Additionally, manually integrating GPR data with GIS can be a time-consuming and error-prone process. In this study, real-world GPR applications are presented, and software named GPR-GIS 10 is created to interactively extract subsurface targets from GPR radargrams and automatically integrate them into GIS. With this software, it is possible to quickly and reliably integrate the two techniques to create informative subsurface maps. The results indicated that automatic integration of GPR with GIS can be an efficient tool to map and view subsurface targets in their appropriate location in 3D space with the needed precision. The findings of this study could help GPR-GIS integrators save time and reduce errors in many GPR-GIS applications.
Keywords: GPR, GIS, GPR-GIS 10, drone technology, automation
Procedia PDF Downloads 92
30271 Secure Intelligent Information Management by Using a Framework of Virtual Phones-On Cloud Computation
Authors: Mohammad Hadi Khorashadi Zadeh
Abstract:
Many new applications and internet services have emerged since the advent of mobile networks and devices. However, these applications have problems of security, management, and performance in business environments. Cloud systems provide information transfer, management facilities, and security for virtual environments. Therefore, an innovative internet service and a business model are proposed in the present study for creating a secure and consolidated environment for managing the mobile information of organizations based on cloud virtual phone (CVP) infrastructure. Using this method, users can run Android and web applications in the cloud, which enhances performance by connecting to other CVP users and increases privacy. It is possible to combine the CVP with distributed protocols and central control, which mimics the behavior of human societies. This combination helps in dealing with sensitive data on mobile devices and facilitates data management with less application overhead.
Keywords: BYOD, mobile cloud computing, mobile security, information management
Procedia PDF Downloads 319