Search results for: big data visualization.
6158 Ontology-Based Approach for Temporal Semantic Modeling of Social Networks
Authors: Souâad Boudebza, Omar Nouali, Faiçal Azouaou
Abstract:
Social networks have recently gained a growing interest on the web. Traditional formalisms for representing social networks are static and suffer from the lack of semantics. In this paper, we will show how semantic web technologies can be used to model social data. The SemTemp ontology aligns and extends existing ontologies such as FOAF, SIOC, SKOS and OWL-Time to provide a temporal and semantically rich description of social data. We also present a modeling scenario to illustrate how our ontology can be used to model social networks.Keywords: Ontology, semantic web, social network, temporal modeling.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15546157 Developing NAND Flash-Memory SSD-Based File System Design
Authors: Jaechun No
Abstract:
This paper focuses on I/O optimizations of N-hybrid (New-Form of hybrid), which provides a hybrid file system space constructed on SSD and HDD. Although the promising potentials of SSD, such as the absence of mechanical moving overhead and high random I/O throughput, have drawn a lot of attentions from IT enterprises, its high ratio of cost/capacity makes it less desirable to build a large-scale data storage subsystem composed of only SSDs. In this paper, we present N-hybrid that attempts to integrate the strengths of SSD and HDD, to offer a single, large hybrid file system space. Several experiments were conducted to verify the performance of N-hybrid.Keywords: SSD, data section, I/O optimizations.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18036156 Data Rate Based Grouping Scheme for Cooperative Communications in Wireless LANs
Authors: Sunmyeng Kim
Abstract:
IEEE 802.11a/b/g standards provide multiple transmission rates, which can be changed dynamically according to the channel condition. Cooperative communications were introduced to improve the overall performance of wireless LANs with the help of relay nodes with higher transmission rates. The cooperative communications are based on the fact that the transmission is much faster when sending data packets to a destination node through a relay node with higher transmission rate, rather than sending data directly to the destination node at low transmission rate. To apply the cooperative communications in wireless LAN, several MAC protocols have been proposed. Some of them can result in collisions among relay nodes in a dense network. In order to solve this problem, we propose a new protocol. Relay nodes are grouped based on their transmission rates. And then, relay nodes only in the highest group try to get channel access. Performance evaluation is conducted using simulation, and shows that the proposed protocol significantly outperforms the previous protocol in terms of throughput and collision probability.
Keywords: Cooperative communications, MAC protocol, relay node, WLAN.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19066155 Improvement in Power Transformer Intelligent Dissolved Gas Analysis Method
Authors: S. Qaedi, S. Seyedtabaii
Abstract:
Non-Destructive evaluation of in-service power transformer condition is necessary for avoiding catastrophic failures. Dissolved Gas Analysis (DGA) is one of the important methods. Traditional, statistical and intelligent DGA approaches have been adopted for accurate classification of incipient fault sources. Unfortunately, there are not often enough faulty patterns required for sufficient training of intelligent systems. By bootstrapping the shortcoming is expected to be alleviated and algorithms with better classification success rates to be obtained. In this paper the performance of an artificial neural network, K-Nearest Neighbour and support vector machine methods using bootstrapped data are detailed and shown that while the success rate of the ANN algorithms improves remarkably, the outcome of the others do not benefit so much from the provided enlarged data space. For assessment, two databases are employed: IEC TC10 and a dataset collected from reported data in papers. High average test success rate well exhibits the remarkable outcome.Keywords: Dissolved gas analysis, Transformer incipient fault, Artificial Neural Network, Support Vector Machine (SVM), KNearest Neighbor (KNN)
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 27396154 An Improved Transmission Scheme in Cooperative Communication System
Authors: Seung-Jun Yu, Young-Min Ko, Hyoung-Kyu Song
Abstract:
Recently developed cooperative diversity scheme enables a terminal to get transmit diversity through the support of other terminals. However, most of the introduced cooperative schemes have a common fault of decreased transmission rate because the destination should receive the decodable compositions of symbols from the source and the relay. In order to achieve high data rate, we propose a cooperative scheme that employs hierarchical modulation. This scheme is free from the rate loss and allows seamless cooperative communication.Keywords: Cooperative communication, hierarchical modulation, high data rate, transmission scheme.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18906153 Automated Method Time Measurement System for Redesigning Dynamic Facility Layout
Authors: Salam Alzubaidi, G. Fantoni, F. Failli, M. Frosolini
Abstract:
The dynamic facility layout problem is a really critical issue in the competitive industrial market; thus, solving this problem requires robust design and effective simulation systems. The sustainable simulation requires inputting reliable and accurate data into the system. So this paper describes an automated system integrated into the real environment to measure the duration of the material handling operations, collect the data in real-time, and determine the variances between the actual and estimated time schedule of the operations in order to update the simulation software and redesign the facility layout periodically. The automated method- time measurement system collects the real data through using Radio Frequency-Identification (RFID) and Internet of Things (IoT) technologies. Hence, attaching RFID- antenna reader and RFID tags enables the system to identify the location of the objects and gathering the time data. The real duration gathered will be manipulated by calculating the moving average duration of the material handling operations, choosing the shortest material handling path, and then updating the simulation software to redesign the facility layout accommodating with the shortest/real operation schedule. The periodic simulation in real-time is more sustainable and reliable than the simulation system relying on an analysis of historical data. The case study of this methodology is in cooperation with a workshop team for producing mechanical parts. Although there are some technical limitations, this methodology is promising, and it can be significantly useful in the redesigning of the manufacturing layout.
Keywords: Dynamic facility layout problem, internet of things, method time measurement, radio frequency identification, simulation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5996152 Main Cause of Children's Deaths in Indigenous Wayuu Community from Department of La Guajira: A Research Developed through Data Mining Use
Authors: Isaura Esther Solano Núñez, David Suarez
Abstract:
The main purpose of this research is to discover what causes death in children of the Wayuu community, and deeply analyze those results in order to take corrective measures to properly control infant mortality. We consider important to determine the reasons that are producing early death in this specific type of population, since they are the most vulnerable to high risk environmental conditions. In this way, the government, through competent authorities, may develop prevention policies and the right measures to avoid an increase of this tragic fact. The methodology used to develop this investigation is data mining, which consists in gaining and examining large amounts of data to produce new and valuable information. Through this technique it has been possible to determine that the child population is dying mostly from malnutrition. In short, this technique has been very useful to develop this study; it has allowed us to transform large amounts of information into a conclusive and important statement, which has made it easier to take appropriate steps to resolve a particular situation.
Keywords: Malnutrition, datamining, analytical, descriptive, population, wayuu, indigenous.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6966151 Air Handling Units Power Consumption Using Generalized Additive Model for Anomaly Detection: A Case Study in a Singapore Campus
Authors: Ju Peng Poh, Jun Yu Charles Lee, Jonathan Chew Hoe Khoo
Abstract:
The emergence of digital twin technology, a digital replica of physical world, has improved the real-time access to data from sensors about the performance of buildings. This digital transformation has opened up many opportunities to improve the management of the building by using the data collected to help monitor consumption patterns and energy leakages. One example is the integration of predictive models for anomaly detection. In this paper, we use the GAM (Generalised Additive Model) for the anomaly detection of Air Handling Units (AHU) power consumption pattern. There is ample research work on the use of GAM for the prediction of power consumption at the office building and nation-wide level. However, there is limited illustration of its anomaly detection capabilities, prescriptive analytics case study, and its integration with the latest development of digital twin technology. In this paper, we applied the general GAM modelling framework on the historical data of the AHU power consumption and cooling load of the building between Jan 2018 to Aug 2019 from an education campus in Singapore to train prediction models that, in turn, yield predicted values and ranges. The historical data are seamlessly extracted from the digital twin for modelling purposes. We enhanced the utility of the GAM model by using it to power a real-time anomaly detection system based on the forward predicted ranges. The magnitude of deviation from the upper and lower bounds of the uncertainty intervals is used to inform and identify anomalous data points, all based on historical data, without explicit intervention from domain experts. Notwithstanding, the domain expert fits in through an optional feedback loop through which iterative data cleansing is performed. After an anomalously high or low level of power consumption detected, a set of rule-based conditions are evaluated in real-time to help determine the next course of action for the facilities manager. The performance of GAM is then compared with other approaches to evaluate its effectiveness. Lastly, we discuss the successfully deployment of this approach for the detection of anomalous power consumption pattern and illustrated with real-world use cases.
Keywords: Anomaly detection, digital twin, Generalised Additive Model, Power Consumption Model.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5016150 Development of a Real-Time Energy Models for Photovoltaic Water Pumping System
Authors: Ammar Mahjoubi, Ridha Fethi Mechlouch, Belgacem Mahdhaoui, Ammar Ben Brahim
Abstract:
This purpose of this paper is to develop and validate a model to accurately predict the cell temperature of a PV module that adapts to various mounting configurations, mounting locations, and climates while only requiring readily available data from the module manufacturer. Results from this model are also compared to results from published cell temperature models. The models were used to predict real-time performance from a PV water pumping systems in the desert of Medenine, south of Tunisia using 60-min intervals of measured performance data during one complete year. Statistical analysis of the predicted results and measured data highlight possible sources of errors and the limitations and/or adequacy of existing models, to describe the temperature and efficiency of PV-cells and consequently, the accuracy of performance of PV water pumping systems prediction models.Keywords: Temperature of a photovoltaic module, Predicted models, PV water pumping systems efficiency, Simulation, Desert of southern Tunisia.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18516149 Wind Farm Power Performance Verification Using Non-Parametric Statistical Inference
Authors: M. Celeska, K. Najdenkoski, V. Dimchev, V. Stoilkov
Abstract:
Accurate determination of wind turbine performance is necessary for economic operation of a wind farm. At present, the procedure to carry out the power performance verification of wind turbines is based on a standard of the International Electrotechnical Commission (IEC). In this paper, nonparametric statistical inference is applied to designing a simple, inexpensive method of verifying the power performance of a wind turbine. A statistical test is explained, examined, and the adequacy is tested over real data. The methods use the information that is collected by the SCADA system (Supervisory Control and Data Acquisition) from the sensors embedded in the wind turbines in order to carry out the power performance verification of a wind farm. The study has used data on the monthly output of wind farm in the Republic of Macedonia, and the time measuring interval was from January 1, 2016, to December 31, 2016. At the end, it is concluded whether the power performance of a wind turbine differed significantly from what would be expected. The results of the implementation of the proposed methods showed that the power performance of the specific wind farm under assessment was acceptable.
Keywords: Canonical correlation analysis, power curve, power performance, wind energy.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 10366148 Knowledge Discovery Techniques for Talent Forecasting in Human Resource Application
Authors: Hamidah Jantan, Abdul Razak Hamdan, Zulaiha Ali Othman
Abstract:
Human Resource (HR) applications can be used to provide fair and consistent decisions, and to improve the effectiveness of decision making processes. Besides that, among the challenge for HR professionals is to manage organization talents, especially to ensure the right person for the right job at the right time. For that reason, in this article, we attempt to describe the potential to implement one of the talent management tasks i.e. identifying existing talent by predicting their performance as one of HR application for talent management. This study suggests the potential HR system architecture for talent forecasting by using past experience knowledge known as Knowledge Discovery in Database (KDD) or Data Mining. This article consists of three main parts; the first part deals with the overview of HR applications, the prediction techniques and application, the general view of Data mining and the basic concept of talent management in HRM. The second part is to understand the use of Data Mining technique in order to solve one of the talent management tasks, and the third part is to propose the potential HR system architecture for talent forecasting.Keywords: HR Application, Knowledge Discovery inDatabase (KDD), Talent Forecasting.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 44826147 Terminal Velocity of a Bubble Rise in a Liquid Column
Authors: Mário A. R. Talaia
Abstract:
As it is known, buoyancy and drag forces rule bubble's rise velocity in a liquid column. These forces are strongly dependent on fluid properties, gravity as well as equivalent's diameter. This study reports a set of bubble rising velocity experiments in a liquid column using water or glycerol. Several records of terminal velocity were obtained. The results show that bubble's rise terminal velocity is strongly dependent on dynamic viscosity effect. The data set allowed to have some terminal velocities data interval of 8.0 ? 32.9 cm/s with Reynolds number interval 1.3 -7490. The bubble's movement was recorded with a video camera. The main goal is to present an original set data and results that will be discussed based on two-phase flow's theory. It will also discussed, the prediction of terminal velocity of a single bubble in liquid, as well as the range of its applicability. In conclusion, this study presents general expressions for the determination of the terminal velocity of isolated gas bubbles of a Reynolds number range, when the fluid proprieties are known.Keywords: Bubbles, terminal velocity, two phase-flow, vertical column.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 186076146 Application of Artificial Neural Network to Classification Surface Water Quality
Authors: S. Wechmongkhonkon, N.Poomtong, S. Areerachakul
Abstract:
Water quality is a subject of ongoing concern. Deterioration of water quality has initiated serious management efforts in many countries. This study endeavors to automatically classify water quality. The water quality classes are evaluated using 6 factor indices. These factors are pH value (pH), Dissolved Oxygen (DO), Biochemical Oxygen Demand (BOD), Nitrate Nitrogen (NO3N), Ammonia Nitrogen (NH3N) and Total Coliform (TColiform). The methodology involves applying data mining techniques using multilayer perceptron (MLP) neural network models. The data consisted of 11 sites of canals in Dusit district in Bangkok, Thailand. The data is obtained from the Department of Drainage and Sewerage Bangkok Metropolitan Administration during 2007-2011. The results of multilayer perceptron neural network exhibit a high accuracy multilayer perception rate at 96.52% in classifying the water quality of Dusit district canal in Bangkok Subsequently, this encouraging result could be applied with plan and management source of water quality.Keywords: artificial neural network, classification, surface water quality
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 32096145 Effects of Li2O Thickness and Moisture Content on LiH Hydrolysis Kinetics in Slightly Humidified Argon
Authors: S. Xiao, M. B. Shuai, M. F. Chu
Abstract:
The hydrolysis kinetics of polycrystalline lithium hydride (LiH) in argon at various low humidities was measured by gravimetry and Raman spectroscopy with ambient water concentration ranging from 200 to 1200 ppm. The results showed that LiH hydrolysis curve revealed a paralinear shape, which was attributed to two different reaction stages that forming different products as explained by the 'Layer Diffusion Control' model. Based on the model, a novel two-stage rate equation for LiH hydrolysis reactions was developed and used to fit the experimental data for determination of Li2O steady thickness Hs and the ultimate hydrolysis rate vs. The fitted data presented a rise of Hs as ambient water concentration cw increased. However, in spite of the negative effect imposed by Hs increasing, the upward trend of vs remained, which implied that water concentration, rather than Li2O thickness, played a predominant role in LiH hydrolysis kinetics. In addition, the proportional relationship between vsHs and cw predicted by rate equation and confirmed by gravimetric data validated the model in such conditions.
Keywords: Hydrolysis kinetics, ‘Layer Diffusion Control’ model, Lithium hydride
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17016144 A Study on Removal of Toluidine Blue Dye from Aqueous Solution by Adsorption onto Neem Leaf Powder
Authors: Himanshu Patel, R. T. Vashi
Abstract:
Adsorption of Toluidine blue dye from aqueous solutions onto Neem Leaf Powder (NLP) has been investigated. The surface characterization of this natural material was examined by Particle size analysis, Scanning Electron Microscopy (SEM), Fourier Transform Infrared (FTIR) spectroscopy and X-Ray Diffraction (XRD). The effects of process parameters such as initial concentration, pH, temperature and contact duration on the adsorption capacities have been evaluated, in which pH has been found to be most effective parameter among all. The data were analyzed using the Langmuir and Freundlich for explaining the equilibrium characteristics of adsorption. And kinetic models like pseudo first- order, second-order model and Elovich equation were utilized to describe the kinetic data. The experimental data were well fitted with Langmuir adsorption isotherm model and pseudo second order kinetic model. The thermodynamic parameters, such as Free energy of adsorption (AG"), enthalpy change (AH') and entropy change (AS°) were also determined and evaluated.
Keywords: Adsorption, isotherm models, kinetic models, temperature, toluidine blue dye, surface chemistry.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17976143 Military Court’s Jurisdiction over Military Members Who Commit General Crimes under Indonesian Military Judiciary System in Comparison with Other Countries
Authors: Dini Dewi Heniarti
Abstract:
The importance of this study is to understand how Indonesian military court asserts its jurisdiction over military members who commit general crimes within the Indonesian military judiciary system in comparison to other countries. This research employs a normative-juridical approach in combination with historical and comparative-juridical approaches. The research specification is analytical-descriptive in nature, i.e. describing or outlining the principles, basic concepts, and norms related to military judiciary system, which are further analyzed within the context of implementation and as the inputs for military justice regulation under the Indonesian legal system. Main data used in this research are secondary data, including primary, secondary and tertiary legal sources. The research focuses on secondary data, while primary data are supplementary in nature. The validity of data is checked using multi-methods commonly known as triangulation, i.e. to reflect the efforts to gain an in-depth understanding of phenomena being studied. Here, the military element is kept intact in the judiciary process with due observance of the Military Criminal Justice System and the Military Command Development Principle. The Indonesian military judiciary jurisdiction over military members committing general crimes is based on national legal system and global development while taking into account the structure, composition and position of military forces within the state structure. Jurisdiction is formulated by setting forth the substantive norm of crimes that are military in nature. At the level of adjudication jurisdiction, the military court has a jurisdiction to adjudicate military personnel who commit general offences. At the level of execution jurisdiction, the military court has a jurisdiction to execute the sentence against military members who have been convicted with a final and binding judgement. Military court's jurisdiction needs to be expanded when the country is in the state of war.
Keywords: Military courts, Jurisdiction, Military members, Military justice system.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 24386142 An Evaluation Method of Accelerated Storage Life Test for Typical Mechanical and Electronic Products
Authors: Jinyong Yao, Hongzhi Li, Chao Du, Jiao Li
Abstract:
Reliability of long-term storage products is related to the availability of the whole system, and the evaluation of storage life is of great necessity. These products are usually highly reliable and little failure information can be collected. In this paper, an analytical method based on data from accelerated storage life test is proposed to evaluate the reliability index of the long-term storage products. Firstly, singularities are eliminated by data normalization and residual analysis. Secondly, with the preprocessed data, the degradation path model is built to obtain the pseudo life values. Then by life distribution hypothesis, we can get the estimator of parameters in high stress levels and verify failure mechanism consistency. Finally, the life distribution under the normal stress level is extrapolated via the acceleration model and evaluation of the actual average life is available. An application example with the camera stabilization device is provided to illustrate the methodology we proposed.
Keywords: Accelerated storage life test, failure mechanism consistency, life distribution, reliability.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 22856141 Personal Health Assistance Service Expert System (PHASES)
Authors: Chakkrit Snae, Michael Brueckner
Abstract:
In this paper the authors present the framework of a system for assisting users through counseling on personal health, the Personal Health Assistance Service Expert System (PHASES). Personal health assistance systems need Personal Health Records (PHR), which support wellness activities, improve the understanding of personal health issues, enable access to data from providers of health services, strengthen health promotion, and in the end improve the health of the population. This is especially important in societies where the health costs increase at a higher rate than the overall economy. The most important elements of a healthy lifestyle are related to food (such as balanced nutrition and diets), activities for body fitness (such as walking, sports, fitness programs), and other medical treatments (such as massage, prescriptions of drugs). The PHASES framework uses an ontology of food, which includes nutritional facts, an expert system keeping track of personal health data that are matched with medical treatments, and a comprehensive data transfer between patients and the system.Keywords: Personal health assistance service, expert system, ontologies, knowledge management, information technology.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19136140 A Hybrid DEA Model for the Measurement of the Enviromental Performance
Authors: A. Hadi-Vencheh, N. Shayesteh Moghadam
Abstract:
Data envelopment analysis (DEA) has gained great popularity in environmental performance measurement because it can provide a synthetic standardized environmental performance index when pollutants are suitably incorporated into the traditional DEA framework. Since some of the environmental performance indicators cannot be controlled by companies managers, it is necessary to develop the model in a way that it could be applied when discretionary and/or non-discretionary factors were involved. In this paper, we present a semi-radial DEA approach to measuring environmental performance, which consists of non-discretionary factors. The model, then, has been applied on a real case.
Keywords: Environmental performance, efficiency, non-discretionary variables, data envelopment analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13776139 Platform Urbanism: Planning towards Hyper-Personalisation
Authors: Provides Ng
Abstract:
Platform economy is a peer-to-peer model of distributing resources facilitated by community-based digital platforms. In recent years, digital platforms are rapidly reconfiguring the public realm using hyper-personalisation techniques. This paper aims at investigating how urban planning can leapfrog into the digital age to help relieve the rising tension of the global issue of labour flow; it discusses the means to transfer techniques of hyper-personalisation into urban planning for plasticity using platform technologies. This research first denotes the limitations of the current system of urban residency, where the system maintains itself on the circulation of documents, which are data on paper. Then, this paper tabulates how some of the institutions around the world, both public and private, digitise data, and streamline communications between a network of systems and citizens using platform technologies. Subsequently, this paper proposes ways in which hyper-personalisation can be utilised to form a digital planning platform. Finally, this paper concludes by reviewing how the proposed strategy may help to open up new ways of thinking about how we affiliate ourselves with cities.
Keywords: Platform urbanism, hyper-personalisation, urban residency, digital data.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6696138 Investigation of Tbilisi City Atmospheric Air Pollution with PM in Usual and Emergency Situations Using the Observational and Numerical Modeling Data
Authors: N. Gigauri, V. Kukhalashvili, V. Sesadze, A. Surmava, L. Intskirveli
Abstract:
Pollution of the Tbilisi atmospheric air with PM2.5 and PM10 in usual and pandemic situations by using the data of 5 stationary observation points is investigated. The values of the statistical characteristic parameters of PM in the atmosphere of Tbilisi are analyzed and trend graphs are constructed. By means of analysis of pollution levels in the quarantine and usual periods the proportion of vehicle traffic in pollution of city is estimated. Experimental measurements of PM2.5, PM10 in the atmosphere have been carried out in different districts of the city and map of the distribution of their concentrations were constructed. It is shown that maximum pollution values are recorded in the city center and along major motorways. It is shown that the average monthly concentrations vary in the range of 0.6-1.6 Maximum Permissible Concentration (MPC). Average daily values of concentration vary at 2-4 days intervals. The distribution of PM10 generated as a result of traffic is numerical modeled. The modeling results are compared with the observation data.
Keywords: Air pollution, numerical modeling, PM2.5, PM10.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5766137 Optimized Preprocessing for Accurate and Efficient Bioassay Prediction with Machine Learning Algorithms
Authors: Jeff Clarine, Chang-Shyh Peng, Daisy Sang
Abstract:
Bioassay is the measurement of the potency of a chemical substance by its effect on a living animal or plant tissue. Bioassay data and chemical structures from pharmacokinetic and drug metabolism screening are mined from and housed in multiple databases. Bioassay prediction is calculated accordingly to determine further advancement. This paper proposes a four-step preprocessing of datasets for improving the bioassay predictions. The first step is instance selection in which dataset is categorized into training, testing, and validation sets. The second step is discretization that partitions the data in consideration of accuracy vs. precision. The third step is normalization where data are normalized between 0 and 1 for subsequent machine learning processing. The fourth step is feature selection where key chemical properties and attributes are generated. The streamlined results are then analyzed for the prediction of effectiveness by various machine learning algorithms including Pipeline Pilot, R, Weka, and Excel. Experiments and evaluations reveal the effectiveness of various combination of preprocessing steps and machine learning algorithms in more consistent and accurate prediction.
Keywords: Bioassay, machine learning, preprocessing, virtual screen.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 9816136 Removal of Methylene Blue from Aqueous Solution by Using Gypsum as a Low Cost Adsorbent
Authors: Muhammad A.Rauf, I.Shehadeh, Amal Ahmed, Ahmed Al-Zamly
Abstract:
Removal of Methylene Blue (MB) from aqueous solution by adsorbing it on Gypsum was investigated by batch method. The studies were conducted at 25°C and included the effects of pH and initial concentration of Methylene Blue. The adsorption data was analyzed by using the Langmuir, Freundlich and Tempkin isotherm models. The maximum monolayer adsorption capacity was found to be 36 mg of the dye per gram of gypsum. The data were also analyzed in terms of their kinetic behavior and was found to obey the pseudo second order equation.Keywords: Adsorption, Dye, Gypsum, Kinetics, Methylene Blue.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 26796135 Lexical Database for Multiple Languages: Multilingual Word Semantic Network
Authors: K. K. Yong, R. Mahmud, C. S. Woo
Abstract:
Data mining and knowledge engineering have become a tough task due to the availability of large amount of data in the web nowadays. Validity and reliability of data also become a main debate in knowledge acquisition. Besides, acquiring knowledge from different languages has become another concern. There are many language translators and corpora developed but the function of these translators and corpora are usually limited to certain languages and domains. Furthermore, search results from engines with traditional 'keyword' approach are no longer satisfying. More intelligent knowledge engineering agents are needed. To address to these problems, a system known as Multilingual Word Semantic Network is proposed. This system adapted semantic network to organize words according to concepts and relations. The system also uses open source as the development philosophy to enable the native language speakers and experts to contribute their knowledge to the system. The contributed words are then defined and linked using lexical and semantic relations. Thus, related words and derivatives can be identified and linked. From the outcome of the system implementation, it contributes to the development of semantic web and knowledge engineering.
Keywords: Multilingual, semantic network, intelligent knowledge engineering.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19636134 Decision Trees for Predicting Risk of Mortality using Routinely Collected Data
Authors: Tessy Badriyah, Jim S. Briggs, Dave R. Prytherch
Abstract:
It is well known that Logistic Regression is the gold standard method for predicting clinical outcome, especially predicting risk of mortality. In this paper, the Decision Tree method has been proposed to solve specific problems that commonly use Logistic Regression as a solution. The Biochemistry and Haematology Outcome Model (BHOM) dataset obtained from Portsmouth NHS Hospital from 1 January to 31 December 2001 was divided into four subsets. One subset of training data was used to generate a model, and the model obtained was then applied to three testing datasets. The performance of each model from both methods was then compared using calibration (the χ2 test or chi-test) and discrimination (area under ROC curve or c-index). The experiment presented that both methods have reasonable results in the case of the c-index. However, in some cases the calibration value (χ2) obtained quite a high result. After conducting experiments and investigating the advantages and disadvantages of each method, we can conclude that Decision Trees can be seen as a worthy alternative to Logistic Regression in the area of Data Mining.Keywords: Decision Trees, Logistic Regression, clinical outcome, risk of mortality.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 25236133 From Industry 4.0 to Agriculture 4.0: A Framework to Manage Product Data in Agri-Food Supply Chain for Voluntary Traceability
Authors: Angelo Corallo, Maria Elena Latino, Marta Menegoli
Abstract:
Agri-food value chain involves various stakeholders with different roles. All of them abide by national and international rules and leverage marketing strategies to advance their products. Food products and related processing phases carry with it a big mole of data that are often not used to inform final customer. Some data, if fittingly identified and used, can enhance the single company, and/or the all supply chain creates a math between marketing techniques and voluntary traceability strategies. Moreover, as of late, the world has seen buying-models’ modification: customer is careful on wellbeing and food quality. Food citizenship and food democracy was born, leveraging on transparency, sustainability and food information needs. Internet of Things (IoT) and Analytics, some of the innovative technologies of Industry 4.0, have a significant impact on market and will act as a main thrust towards a genuine ‘4.0 change’ for agriculture. But, realizing a traceability system is not simple because of the complexity of agri-food supply chain, a lot of actors involved, different business models, environmental variations impacting products and/or processes, and extraordinary climate changes. In order to give support to the company involved in a traceability path, starting from business model analysis and related business process a Framework to Manage Product Data in Agri-Food Supply Chain for Voluntary Traceability was conceived. Studying each process task and leveraging on modeling techniques lead to individuate information held by different actors during agri-food supply chain. IoT technologies for data collection and Analytics techniques for data processing supply information useful to increase the efficiency intra-company and competitiveness in the market. The whole information recovered can be shown through IT solutions and mobile application to made accessible to the company, the entire supply chain and the consumer with the view to guaranteeing transparency and quality.
Keywords: Agriculture 4.0, agri-food supply chain, Industry 4.0, voluntary traceability.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 23486132 Info-participation of the Disabled Using the Mixed Preference Data in Improving Their Travel Quality
Authors: Y. Duvarci, S. Mizokami
Abstract:
Today, the preferences and participation of the TD groups such as the elderly and disabled is still lacking in decision-making of transportation planning, and their reactions to certain type of policies are not well known. Thus, a clear methodology is needed. This study aimed to develop a method to extract the preferences of the disabled to be used in the policy-making stage that can also guide to future estimations. The method utilizes the combination of cluster analysis and data filtering using the data of the Arao city (Japan). The method is a process that follows: defining the TD group by the cluster analysis tool, their travel preferences in tabular form from the household surveys by policy variableimpact pairs, zones, and by trip purposes, and the final outcome is the preference probabilities of the disabled. The preferences vary by trip purpose; for the work trips, accessibility and transit system quality policies with the accompanying impacts of modal shifts towards public mode use as well as the decreasing travel costs, and the trip rate increase; for the social trips, the same accessibility and transit system policies leading to the same mode shift impact, together with the travel quality policy area leading to trip rate increase. These results explain the policies to focus and can be used in scenario generation in models, or any other planning purpose as decision support tool.
Keywords: Transportation Disadvantaged, Disabled, Mixed Preference, Stated Preference Data.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 10796131 Analyzing Current Transformer’s Transient and Steady State Behavior for Different Burden’s Using LabVIEW Data Acquisition Tool
Abstract:
Current transformers (CTs) are used to transform large primary currents to a small secondary current. Since most standard equipment’s are not designed to handle large primary currents the CTs have an important part in any electrical system for the purpose of Metering and Protection both of which are integral in Power system. Now a days due to advancement in solid state technology, the operation times of the protective relays have come to a few cycles from few seconds. Thus, in such a scenario it becomes important to study the transient response of the current transformers as it will play a vital role in the operating of the protective devices.
This paper shows the steady state and transient behavior of current transformers and how it changes with change in connected burden. The transient and steady state response will be captured using the data acquisition software LabVIEW. Analysis is done on the real time data gathered using LabVIEW. Variation of current transformer characteristics with changes in burden will be discussed.
Keywords: Accuracy, Accuracy limiting factor, Burden, Current Transformer, Instrument Security factor.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 33186130 A Metric-Set and Model Suggestion for Better Software Project Cost Estimation
Authors: Murat Ayyıldız, Oya Kalıpsız, Sırma Yavuz
Abstract:
Software project effort estimation is frequently seen as complex and expensive for individual software engineers. Software production is in a crisis. It suffers from excessive costs. Software production is often out of control. It has been suggested that software production is out of control because we do not measure. You cannot control what you cannot measure. During last decade, a number of researches on cost estimation have been conducted. The metric-set selection has a vital role in software cost estimation studies; its importance has been ignored especially in neural network based studies. In this study we have explored the reasons of those disappointing results and implemented different neural network models using augmented new metrics. The results obtained are compared with previous studies using traditional metrics. To be able to make comparisons, two types of data have been used. The first part of the data is taken from the Constructive Cost Model (COCOMO'81) which is commonly used in previous studies and the second part is collected according to new metrics in a leading international company in Turkey. The accuracy of the selected metrics and the data samples are verified using statistical techniques. The model presented here is based on Multi-Layer Perceptron (MLP). Another difficulty associated with the cost estimation studies is the fact that the data collection requires time and care. To make a more thorough use of the samples collected, k-fold, cross validation method is also implemented. It is concluded that, as long as an accurate and quantifiable set of metrics are defined and measured correctly, neural networks can be applied in software cost estimation studies with successKeywords: Software Metrics, Software Cost Estimation, Neural Network.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19576129 Flow Duration Curves and Recession Curves Connection through a Mathematical Link
Authors: Elena Carcano, Mirzi Betasolo
Abstract:
This study helps Public Water Bureaus in giving reliable answers to water concession requests. Rapidly increasing water requests can be supported provided that further uses of a river course are not totally compromised, and environmental features are protected as well. Strictly speaking, a water concession can be considered a continuous drawing from the source and causes a mean annual streamflow reduction. Therefore, deciding if a water concession is appropriate or inappropriate seems to be easily solved by comparing the generic demand to the mean annual streamflow value at disposal. Still, the immediate shortcoming for such a comparison is that streamflow data are information available only for few catchments and, most often, limited to specific sites. Subsequently, comparing the generic water demand to mean daily discharge is indeed far from being completely satisfactory since the mean daily streamflow is greater than the water withdrawal for a long period of a year. Consequently, such a comparison appears to be of little significance in order to preserve the quality and the quantity of the river. In order to overcome such a limit, this study aims to complete the information provided by flow duration curves introducing a link between Flow Duration Curves (FDCs) and recession curves and aims to show the chronological sequence of flows with a particular focus on low flow data. The analysis is carried out on 25 catchments located in North-Eastern Italy for which daily data are provided. The results identify groups of catchments as hydrologically homogeneous, having the lower part of the FDCs (corresponding streamflow interval is streamflow Q between 300 and 335, namely: Q(300), Q(335)) smoothly reproduced by a common recession curve. In conclusion, the results are useful to provide more reliable answers to water request, especially for those catchments which show similar hydrological response and can be used for a focused regionalization approach on low flow data. A mathematical link between streamflow duration curves and recession curves is herein provided, thus furnishing streamflow duration curves information upon a temporal sequence of data. In such a way, by introducing assumptions on recession curves, the chronological sequence upon low flow data can also be attributed to FDCs, which are known to lack this information by nature.
Keywords: Chronological sequence of discharges, recession curves, streamflow duration curves, water concession.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 595