Search results for: data harvesting
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 24934

24154 Minimum Data of a Speech Signal as Special Indicators of Identification in Phonoscopy

Authors: Nazaket Gazieva

Abstract:

Voice biometric data associated with physiological, psychological and other factors are widely used in forensic phonoscopy. There are various methods for identifying and verifying a person by voice. This article explores the minimum speech signal data as individual parameters of a speech signal. Monozygotic twins are believed to be genetically identical. Using the minimum data of the speech signal, we concluded that the voice imprint of monozygotic twins is individual. The experiment indicates that the minimum indicators of the speech signal are more stable and reliable for phonoscopic examinations.

Keywords: phonogram, speech signal, temporal characteristics, fundamental frequency, biometric fingerprints

Procedia PDF Downloads 132
24153 A Non-parametric Clustering Approach for Multivariate Geostatistical Data

Authors: Francky Fouedjio

Abstract:

Multivariate geostatistical data have become omnipresent in the geosciences and pose substantial analysis challenges. One of them is the grouping of data locations into spatially contiguous clusters so that data locations within the same cluster are more similar while clusters are different from each other, in some sense. Spatially contiguous clusters can significantly improve interpretation, turning the resulting clusters into meaningful geographical subregions. In this paper, we develop an agglomerative hierarchical clustering approach that takes into account the spatial dependency between observations. It relies on a dissimilarity matrix built from a non-parametric kernel estimator of the spatial dependence structure of the data. It integrates existing methods to find the optimal number of clusters and to evaluate the contribution of variables to the clustering. The capability of the proposed approach to provide spatially compact, connected and meaningful clusters is assessed using a bivariate synthetic dataset and a multivariate geochemical dataset. The proposed clustering method gives satisfactory results compared to other similar geostatistical clustering methods.

Keywords: clustering, geostatistics, multivariate data, non-parametric

Procedia PDF Downloads 473
24152 Big Data in Telecom Industry: Effective Predictive Techniques on Call Detail Records

Authors: Sara ElElimy, Samir Moustafa

Abstract:

Mobile network operators face many challenges in the digital era, especially high demands from customers. Mobile network operators are a source of big data, and traditional techniques are not effective in the new era of big data, the Internet of Things (IoT) and 5G; as a result, handling different big datasets effectively becomes a vital task for operators as data grows continuously and networks move from Long Term Evolution (LTE) to 5G. There is therefore an urgent need for effective big data analytics to predict future demand, traffic, and network performance in order to fulfill the requirements of the fifth generation of mobile network technology. In this paper, we introduce data science techniques using machine learning and deep learning algorithms: the autoregressive integrated moving average (ARIMA), Bayesian-based curve fitting, and a recurrent neural network (RNN) are employed for a data-driven application for mobile network operators. The framework for each model comprises identification of its parameters, estimation, prediction, and the final data-driven application of this prediction to business and network performance. These models are applied to the call detail record (CDR) datasets of the Telecom Italia Big Data Challenge. Evaluating the models with well-known evaluation criteria shows that ARIMA (the machine learning-based model) is more accurate as a predictive model on this dataset than the RNN (the deep learning model).

Keywords: big data analytics, machine learning, CDRs, 5G

Procedia PDF Downloads 132
24151 A Data Mining Approach for Analysing and Predicting the Bank's Asset Liability Management Based on Basel III Norms

Authors: Nidhin Dani Abraham, T. K. Sri Shilpa

Abstract:

Asset liability management is an important aspect of the banking business. Moreover, today’s banking is based on BASEL III, which strictly regulates counterparty default. This paper focuses on the prediction and analysis of counterparty default risk, a type of risk that occurs when customers fail to repay the amount owed to the lender (a bank or any other financial institution). This paper proposes an approach to reduce the counterparty risk occurring in financial institutions using an appropriate data mining technique and thus predicts the occurrence of non-performing assets (NPA). It also helps in asset building and restructuring quality. Liability management is very important for carrying out banking business. To know and analyze the depth of a bank's liability, a suitable technique is required. For that purpose, a data mining technique is used to predict the dormant behaviour of various deposit bank customers. Various models are implemented, and the results are analyzed for savings bank deposit customers. All these data are drawn from the bank data warehouse and cleaned using a data cleansing approach.

Keywords: data mining, asset liability management, BASEL III, banking

Procedia PDF Downloads 543
24150 A Dynamic Ensemble Learning Approach for Online Anomaly Detection in Alibaba Datacenters

Authors: Wanyi Zhu, Xia Ming, Huafeng Wang, Junda Chen, Lu Liu, Jiangwei Jiang, Guohua Liu

Abstract:

Anomaly detection is a first and imperative step needed to respond to unexpected problems and to assure high performance and security in large data center management. This paper presents an online anomaly detection system through an innovative approach of ensemble machine learning and adaptive differentiation algorithms, and applies them to performance data collected from a continuous monitoring system for multi-tier web applications running in Alibaba data centers. We evaluate the effectiveness and efficiency of this algorithm with production traffic data and compare with the traditional anomaly detection approaches such as a static threshold and other deviation-based detection techniques. The experiment results show that our algorithm correctly identifies the unexpected performance variances of any running application, with an acceptable false positive rate. This proposed approach has already been deployed in real-time production environments to enhance the efficiency and stability in daily data center operations.
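The two baselines the abstract compares against, a static threshold and a deviation-based detector, can be illustrated with a minimal sketch. This is not the authors' ensemble method; the function names, the window size and the factor `k` are assumptions chosen for illustration only.

```python
from statistics import mean, stdev

def static_threshold_anomalies(values, limit):
    """Return indices of samples that exceed a fixed limit."""
    return [i for i, v in enumerate(values) if v > limit]

def deviation_anomalies(values, window=5, k=3.0):
    """Return indices of samples that lie more than k standard
    deviations from the mean of the preceding `window` samples."""
    flagged = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        # Skip flat windows, where the deviation is undefined.
        if sigma > 0 and abs(values[i] - mu) > k * sigma:
            flagged.append(i)
    return flagged

# Hypothetical response-time samples with one spike at index 6.
samples = [10, 11, 10, 12, 11, 10, 50, 11, 10, 12]
static_hits = static_threshold_anomalies(samples, limit=40)
deviation_hits = deviation_anomalies(samples, window=5, k=3.0)
```

A static limit has to be tuned per metric, whereas the rolling detector adapts to the recent level of the signal, which is one reason adaptive approaches are attractive for heterogeneous workloads.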

Keywords: Alibaba data centers, anomaly detection, big data computation, dynamic ensemble learning

Procedia PDF Downloads 190
24149 Unsupervised Text Mining Approach to Early Warning System

Authors: Ichihan Tai, Bill Olson, Paul Blessner

Abstract:

Traditional early warning systems that alarm against crisis are generally based on structured or numerical data; therefore, a system that can make predictions based on unstructured textual data, an uncorrelated data source, is a great complement to the traditional early warning systems. The Chicago Board Options Exchange (CBOE) Volatility Index (VIX), commonly referred to as the fear index, measures the cost of insurance against market crash, and spikes in the event of crisis. In this study, news data is consumed for prediction of whether there will be a market-wide crisis by predicting the movement of the fear index, and the historical references to similar events are presented in an unsupervised manner. Topic modeling-based prediction and representation are made based on daily news data between 1990 and 2015 from The Wall Street Journal against VIX index data from CBOE.

Keywords: early warning system, knowledge management, market prediction, topic modeling

Procedia PDF Downloads 328
24148 The Role of Synthetic Data in Aerial Object Detection

Authors: Ava Dodd, Jonathan Adams

Abstract:

The purpose of this study is to explore the characteristics of developing a machine learning application using synthetic data. The study is structured to develop the application for the purpose of deploying the computer vision model. The findings discuss the realities of attempting to develop a computer vision model for practical purpose, and detail the processes, tools, and techniques that were used to meet accuracy requirements. The research reveals that synthetic data represents another variable that can be adjusted to improve the performance of a computer vision model. Further, a suite of tools and tuning recommendations are provided.

Keywords: computer vision, machine learning, synthetic data, YOLOv4

Procedia PDF Downloads 212
24147 Perception-Oriented Model Driven Development for Designing Data Acquisition Process in Wireless Sensor Networks

Authors: K. Indra Gandhi

Abstract:

Wireless Sensor Networks (WSNs) have always been characterized by application-specific sensing, relaying and collection of information for further analysis. However, software development was not considered a separate entity in this process of data collection, which has posed severe limitations on software development for WSNs. Software development for WSNs is a complex process since the components involved are data-driven, network-driven and application-driven in nature. This implies that there is a tremendous need for separation of concerns from the software development perspective. A layered approach for developing data acquisition design based on Model Driven Development (MDD) has been proposed, as the sensed data collection process itself varies depending upon the application under consideration. This work focuses on the layered view of the data acquisition process so as to ease development from the software point of view. A metamodel has been proposed that enables reusability and realization of the software development as an adaptable component for WSN systems. Further, observation of users' perception indicates that the proposed model helps improve programmers' productivity by realizing the collaborative system involved.

Keywords: data acquisition, model-driven development, separation of concern, wireless sensor networks

Procedia PDF Downloads 424
24146 Comparative Analysis of Data Gathering Protocols with Multiple Mobile Elements for Wireless Sensor Network

Authors: Bhat Geetalaxmi Jairam, D. V. Ashoka

Abstract:

Wireless Sensor Networks (WSNs) are used in many applications to collect sensed data from different sources. Sensed data has to be delivered through the sensors' wireless interface using multi-hop communication towards the sink. Data collection in wireless sensor networks consumes energy, and energy consumption is the major constraint in WSNs. Reducing energy consumption while increasing the amount of generated data is a great challenge. In this paper, we have implemented two data gathering protocols with multiple mobile sinks/elements to collect data from sensor nodes. The first is Energy-Efficient Data Gathering with Tour Length-Constrained Mobile Elements in Wireless Sensor Networks (EEDG), in which mobile sinks use a vehicle routing protocol to collect data. The second is An Intelligent Agent-based Routing Structure for Mobile Sinks in WSNs (IAR), in which mobile sinks use Prim’s algorithm to collect data. We have implemented the concepts that are common to both protocols, such as deployment of mobile sinks, generation of the visiting schedule, and collection of data from the cluster members. We have compared the performance of both protocols using statistics on performance parameters such as delay, packet drop, packet delivery ratio, energy available, and control overhead. We conclude that EEDG is more efficient than the IAR protocol, but with a few limitations, which include unaddressed issues such as redundancy removal, idle listening, and the mobile sink’s pause/wait state at the node. In future work, we plan to concentrate on these limitations to arrive at a new energy-efficient protocol that will help improve the lifetime of the WSN.
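Prim's algorithm, which the abstract names for the IAR protocol, builds a minimum spanning tree outward from a start node. The sketch below is a generic textbook version, not the IAR implementation; the graph, node names and edge weights (read here as link costs) are hypothetical.

```python
import heapq

def prim_mst(graph, start):
    """Minimum spanning tree of a connected, undirected, weighted graph.

    graph: dict mapping node -> dict of neighbor -> weight.
    Returns (edges, total_weight), where edges are (from, to, weight).
    """
    visited = {start}
    frontier = [(w, start, v) for v, w in graph[start].items()]
    heapq.heapify(frontier)
    edges, total = [], 0
    while frontier and len(visited) < len(graph):
        w, u, v = heapq.heappop(frontier)  # cheapest edge leaving the tree
        if v in visited:
            continue
        visited.add(v)
        edges.append((u, v, w))
        total += w
        for nxt, w2 in graph[v].items():
            if nxt not in visited:
                heapq.heappush(frontier, (w2, v, nxt))
    return edges, total

# Hypothetical topology: a sink S and three cluster heads A, B, C.
network = {
    "S": {"A": 2, "B": 5},
    "A": {"S": 2, "B": 1, "C": 4},
    "B": {"S": 5, "A": 1, "C": 3},
    "C": {"A": 4, "B": 3},
}
tree_edges, tree_cost = prim_mst(network, "S")
```

Rooting the tree at the sink keeps the total link cost minimal, though a spanning tree alone says nothing about tour length, which is the quantity EEDG constrains.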

Keywords: aggregation, consumption, data gathering, efficiency

Procedia PDF Downloads 484
24145 Extraction and Characterization of Ethiopian Hibiscus macranthus Bast Fiber

Authors: Solomon Tilahun Desisa, Muktar Seid Hussen

Abstract:

Hibiscus macranthus is a plant of the family Malvaceae and genus Hibiscus that grows mainly in the western part of Ethiopia. Hibiscus macranthus is the most adaptable and abundant plant in the nation; it is used as an ornamental plant, often as a hedge or fence plant, as firewood after the stem is harvested together with the bark, and as a fiber for tying different kinds of things by forming rope. However, Hibiscus macranthus plant fibre has not been commercially exploited or properly extracted. This study describes the possibility of mechanical and retting methods for Hibiscus macranthus fibre extraction and characterization. Hibiscus macranthus fibre is a bast fibre, a natural cellulose plant fiber obtained from the stem or stalks of the dicotyledonous plant. The fibre was characterized by studying its physical and chemical properties. The physical characteristics investigated were a length of 100-190 mm, fineness of 1.0-1.2 Tex, a diameter of 16-21 microns under X100 microscopic view, a moisture content of 12.46%, and a dry tenacity of 48-57 cN/Tex along with a breaking extension of 0.9-1.6%. Hibiscus macranthus fiber productivity was observed to be 12-18% of the stem, of which more than 65% is primary long fibers. The fiber separation methods were found to decrease non-cellulose ingredients in the order of mechanical, water and chemical methods. The color measurement also shows that raw Hibiscus macranthus fiber has a natural golden color according to YID1925 and a paler look under both retting methods than under mechanical separation. Finally, it is suggested that Hibiscus macranthus fibre can be used for manufacturing natural and organic crop and coffee packages as well as superabsorbent, fine and high-tenacity textile products.

Keywords: Hibiscus macranthus, bast fiber, extraction, characterization

Procedia PDF Downloads 199
24144 Status and Results from EXO-200

Authors: Ryan Maclellan

Abstract:

EXO-200 has provided one of the most sensitive searches for neutrinoless double-beta decay utilizing 175 kg of enriched liquid xenon in an ultra-low background time projection chamber. This detector has demonstrated excellent energy resolution and background rejection capabilities. Using the first two years of data, EXO-200 has set a limit of 1.1x10^25 years at 90% C.L. on the neutrinoless double-beta decay half-life of Xe-136. The experiment has experienced a brief hiatus in data taking during a temporary shutdown of its host facility: the Waste Isolation Pilot Plant. EXO-200 expects to resume data taking in earnest this fall with upgraded detector electronics. Results from the analysis of EXO-200 data and an update on the current status of EXO-200 will be presented.

Keywords: double-beta, Majorana, neutrino, neutrinoless

Procedia PDF Downloads 404
24143 Remaining Useful Life (RUL) Assessment Using Progressive Bearing Degradation Data and ANN Model

Authors: Amit R. Bhende, G. K. Awari

Abstract:

Remaining useful life (RUL) prediction is one of the key technologies for realizing prognostics and health management, which is being widely applied in many industrial systems to ensure high system availability over their life cycles. The present work proposes a data-driven method of RUL prediction based on multiple health state assessment for rolling element bearings. Run-to-failure bearing degradation data at three different conditions is used. A RUL prediction model is built separately for each condition. Feed-forward back-propagation neural network models are developed for prediction modeling.

Keywords: bearing degradation data, remaining useful life (RUL), back propagation, prognosis

Procedia PDF Downloads 430
24142 Spatio-Temporal Data Mining with Association Rules for Lake Van

Authors: Tolga Aydin, M. Fatih Alaeddinoğlu

Abstract:

Throughout history, people have made estimates and inferences about the future by using their past experiences. Developing information technologies and improvements in database management systems make it possible to extract useful information from the knowledge in hand for strategic decisions. Therefore, different methods have been developed. Data mining by association rule learning is one such method. The Apriori algorithm, one of the well-known association rule learning algorithms, is not commonly used on spatio-temporal data sets. However, it is possible to embed time and space features into the data sets and make the Apriori algorithm a suitable data mining technique for learning spatio-temporal association rules. Lake Van, the largest lake in Turkey, is a closed basin. This feature causes the volume of the lake to increase or decrease as the amount of water it holds changes. In this study, evaporation, humidity, lake altitude, amount of rainfall and temperature parameters recorded in the Lake Van region over the years are used by the Apriori algorithm, and a spatio-temporal data mining application is developed to identify overflows and newly formed soil regions (underflows) occurring in the coastal parts of Lake Van. Identifying possible reasons for overflows and underflows may be used to alert experts to take precautions and make the necessary investments.
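The idea of embedding time and space features as ordinary items so that Apriori can mine them can be sketched as follows. This is a minimal, unoptimized Apriori written for illustration; the item names such as `season=spring` and the four records are invented and are not the Lake Van data.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset whose support
    (fraction of transactions containing it) is >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    current = {frozenset([i]) for t in transactions for i in t}
    frequent, k = {}, 1
    while current:
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a, b in combinations(list(level), 2)
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset must itself be frequent.
        current = {c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent

def confidence(frequent, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    a = frozenset(antecedent)
    return frequent[a | frozenset(consequent)] / frequent[a]

# Hypothetical records: time (season) is encoded as a plain item.
records = [
    {"season=spring", "rainfall=high", "level=rising"},
    {"season=spring", "rainfall=high", "level=rising"},
    {"season=summer", "rainfall=low", "level=falling"},
    {"season=spring", "rainfall=low", "level=rising"},
]
itemsets = apriori(records, min_support=0.5)
rule_conf = confidence(itemsets, {"season=spring", "rainfall=high"}, {"level=rising"})
```

Because the season is just another item, a rule such as "spring and high rainfall imply a rising level" comes out of the same support and confidence machinery as any non-temporal rule; a spatial tag such as a shore segment could be added in the same way.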

Keywords: apriori algorithm, association rules, data mining, spatio-temporal data

Procedia PDF Downloads 363
24141 Building Data Infrastructure for Public Use and Informed Decision Making in Developing Countries-Nigeria

Authors: Busayo Fashoto, Abdulhakeem Shaibu, Justice Agbadu, Samuel Aiyeoribe

Abstract:

Data has gone from just rows and columns to being an infrastructure itself. The traditional medium of data infrastructure has been managed by individuals in different industries and saved on personal work tools; one of such is the laptop. This hinders data sharing and Sustainable Development Goal (SDG) 9 for infrastructure sustainability across all countries and regions. However, there has been a constant demand for data across different agencies and ministries by investors and decision-makers. The rapid development and adoption of open-source technologies that promote the collection and processing of data in new ways and in ever-increasing volumes are creating new data infrastructure in sectors such as lands and health, among others. This paper examines the process of developing data infrastructure and, by extension, a data portal to provide baseline data for sustainable development and decision making in Nigeria. This paper employs the FAIR principle (Findable, Accessible, Interoperable, and Reusable) of data management using open-source technology tools to develop data portals for public use. eHealth Africa, an organization that uses technology to drive public health interventions in Nigeria, developed a data portal which is a typical data infrastructure that serves as a repository for various datasets on administrative boundaries, points of interest, settlements, social infrastructure, amenities, and others. This portal makes it possible for users to have access to datasets of interest at any point in time at no cost. A skeletal infrastructure of this data portal encompasses the use of open-source technology such as Postgres database, GeoServer, GeoNetwork, and CKan. These tools made the infrastructure sustainable, thus promoting the achievement of SDG 9 (Industries, Innovation, and Infrastructure). As of 6th August 2021, a wider cross-section of 8192 users had been created, 2262 datasets had been downloaded, and 817 maps had been created from the platform. 
This paper shows how the rapid development and adoption of technologies facilitate data collection, processing, and publishing in new ways and in ever-increasing volumes. In addition, the paper is explicit on new data infrastructure in sectors such as health, social amenities, and agriculture. Furthermore, this paper reveals the importance of cross-sectional data infrastructures for planning and decision making, which in turn can form a central data repository for sustainable development across developing countries.

Keywords: data portal, data infrastructure, open source, sustainability

Procedia PDF Downloads 83
24140 Process Data-Driven Representation of Abnormalities for Efficient Process Control

Authors: Hyun-Woo Cho

Abstract:

Unexpected operational events or abnormalities in industrial processes have a serious impact on the quality of the final product of interest. In terms of statistical process control, fault detection and diagnosis of processes is one of the essential tasks needed to run the process safely. In this work, a nonlinear representation of process measurement data is presented and evaluated using a simulation process. The effect of using different representation methods on the diagnosis performance is tested in terms of computational efficiency and data handling. The results show that the nonlinear representation technique produces more reliable diagnosis results and outperforms linear methods. The use of a data filtering step improved computational speed and diagnosis performance for test data sets. The presented scheme differs from existing ones in that it attempts to extract the fault pattern in the reduced space, not in the original process variable space. Thus, this scheme helps to reduce the sensitivity of empirical models to noise.

Keywords: fault diagnosis, nonlinear technique, process data, reduced spaces

Procedia PDF Downloads 242
24139 Text-to-Speech in Azerbaijani Language via Transfer Learning in a Low Resource Environment

Authors: Dzhavidan Zeinalov, Bugra Sen, Firangiz Aslanova

Abstract:

Most text-to-speech models do not perform well in low-resource languages and require a large amount of high-quality training data to be considered good enough. Yet, with the improvements made in ASR systems, it is now easier than ever to collect data for the design of custom text-to-speech models. In this work, we outline our use of an ASR model to collect data to build a viable text-to-speech system for one of the leading financial institutions of Azerbaijan. NVIDIA’s implementation of the Tacotron 2 model was utilized along with the HiFiGAN vocoder. For training, the model was first trained with high-quality audio data collected from the Internet, then fine-tuned on the bank’s single-speaker call center data. The results were then evaluated by 50 different listeners and yielded a mean opinion score of 4.17, showing that our method is indeed viable. With this, we have successfully designed the first text-to-speech model in Azerbaijani and publicly shared 12 hours of audiobook data for everyone to use.

Keywords: Azerbaijani language, HiFiGAN, Tacotron 2, text-to-speech, transfer learning, whisper

Procedia PDF Downloads 33
24138 An Empirical Evaluation of Performance of Machine Learning Techniques on Imbalanced Software Quality Data

Authors: Ruchika Malhotra, Megha Khanna

Abstract:

The development of change prediction models can help software practitioners plan testing and inspection resources at early phases of software development. However, a major challenge faced during the training of any classification model is the imbalanced nature of software quality data. A dataset with very few instances of the minority outcome categories leads to an inefficient learning process, and a classification model developed from imbalanced data generally does not predict these minority categories correctly. Thus, for a given dataset, a minority of classes may be change prone whereas a majority of classes may be non-change prone. This study explores various alternatives for adeptly handling imbalanced software quality data using different sampling methods and effective MetaCost learners. The study also analyzes and justifies the use of different performance metrics when dealing with imbalanced data. In order to empirically validate the different alternatives, the study uses change data from three application packages of an open-source Android data set and evaluates the performance of six different machine learning techniques. The results of the study indicate extensive improvement in the performance of the classification models when using resampling methods and robust performance measures.
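One of the simplest sampling methods of the kind the abstract refers to is random oversampling of the minority category. The sketch below is a generic illustration, not necessarily one of the methods the study evaluated; the class labels and the metric values in the example rows are hypothetical.

```python
import random
from collections import Counter

def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every
    class has as many rows as the largest class."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, cnt in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - cnt):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels

# Hypothetical object-oriented metric vectors: 8 stable classes, 2 change-prone.
metrics = [[i, i * 2] for i in range(10)]
outcome = ["stable"] * 8 + ["change_prone"] * 2
balanced_rows, balanced_labels = random_oversample(metrics, outcome)
balanced_counts = Counter(balanced_labels)
```

Resampling should be applied to the training split only; oversampling before the train/test split would leak duplicated rows into the test data and inflate the measured performance.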

Keywords: change proneness, empirical validation, imbalanced learning, machine learning techniques, object-oriented metrics

Procedia PDF Downloads 412
24137 Influence of Pseudomonas japonica on Growth and Metal Tolerance of Celosia cristata L.

Authors: Muhammad Umair Mushtaq, Ameena Iqbal, Muhammad Aqib Hassan Ali Khan, Ismat Nawaz, Sohail Yousaf, Mazhar Iqbal

Abstract:

Heavy metals are priority pollutants, as they pose serious health and environmental threats. They can be removed by various physicochemical methods, but these are costly and cause additional environmental problems. Bioremediation, which exploits plants and their associated microbes, has been described as a cost-effective and environmentally friendly technique. In this study, a pot experiment was conducted in a greenhouse to evaluate the potential of Celosia cristata and the effects of the bacterium Pseudomonas japonica and the organic amendment moss/compost on tolerating and accumulating heavy metals. Two-week-old seedlings were transferred to soil in pots; after four weeks they were inoculated with the bacterial strain, after six weeks of growth they were watered with metal-containing synthetic wastewater, and they were harvested after a growth period of nine weeks. After harvesting, morphological and physiological parameters and the metal content of the plants were measured. The results showed the highest plant growth and biomass production with organic amendments, while the highest metal uptake was found in non-amended pots. Positive controls showed the highest Pb uptake of 2900 mg/kg DW, while P. japonica amended pots showed the highest Cd, Cr, Ni and Cu uptake of 963.53, 1481.17, 1022.01 and 602.17 mg/kg DW, respectively. In conclusion, organic amendments have strong impacts on growth enhancement, while P. japonica enhances metal translocation and accumulation to aerial parts with little significant involvement in plant growth.

Keywords: ornamental plants, plant microbe interaction, amendments, bacteria

Procedia PDF Downloads 283
24136 Quality of Age Reporting from Tanzania 2012 Census Results: An Assessment Using Whipple’s Index, Myer’s Blended Index, and Age-Sex Accuracy Index

Authors: A. Sathiya Susuman, Hamisi F. Hamisi

Abstract:

Background: Many socio-economic and demographic data are age-sex attributed. However, a variety of irregularities and misstatements are noted in age-related data, and fewer in sex data because of the biological differences between the sexes. Noting the misreporting of age data despite its importance in demographic and epidemiological studies, this study aims to assess the quality of the 2012 Tanzania Population and Housing Census (PHC) results. Methods: Data for the analysis were downloaded from the Tanzania National Bureau of Statistics. Age heaping and digit preference were measured using summary indices, viz. Whipple’s index, Myers’ blended index, and the Age-Sex Accuracy index. Results: The recorded Whipple’s index for both sexes was 154.43; males had the lower index, about 152.65, while females had the higher index, about 156.07. For Myers’ blended index, the preferences were at digits ‘0’ and ‘5’ while avoidance was at digits ‘1’ and ‘3’ for both sexes. Finally, the Age-Sex Accuracy index stood at 59.8, where the sex ratio score was 5.82 and the age ratio scores were 20.89 and 21.4 for males and females respectively. Conclusion: The evaluation of the 2012 PHC data using demographic techniques found the data inaccurate as a result of systematic heaping and digit preference/avoidance. Thus, innovative methods of data collection, along with measuring and minimizing errors using statistical techniques, should be used to ensure the accuracy of age data.
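For readers unfamiliar with the measure, the standard Whipple's index is five times the share of the population aged 23 to 62 whose reported age ends in 0 or 5, expressed as a percentage. The sketch below assumes single-year age counts in a dictionary; the function name and the toy populations are illustrative and are not the Tanzanian census data.

```python
def whipple_index(pop_by_age):
    """Standard Whipple's index over ages 23 to 62 inclusive.

    pop_by_age: dict mapping single-year age -> population count.
    100 means no preference for ages ending in 0 or 5;
    500 means every reported age ends in 0 or 5.
    """
    ages = range(23, 63)
    total = sum(pop_by_age.get(a, 0) for a in ages)
    heaped = sum(pop_by_age.get(a, 0) for a in ages if a % 5 == 0)
    return 100.0 * 5 * heaped / total

# Toy populations: perfectly even reporting vs. complete heaping.
even = {age: 1000 for age in range(0, 100)}
fully_heaped = {age: (1000 if age % 5 == 0 else 0) for age in range(0, 100)}
w_even = whipple_index(even)
w_heaped = whipple_index(fully_heaped)
```

On this scale, the reported value of 154.43 lies well above the neutral value of 100, which is consistent with the abstract's conclusion of systematic heaping on digits 0 and 5.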

Keywords: age heaping, digit preference/avoidance, summary indices, Whipple’s index, Myer’s index, age-sex accuracy index

Procedia PDF Downloads 465
24135 Smart Laboratory for Clean Rivers in India - An Indo-Danish Collaboration

Authors: Nikhilesh Singh, Shishir Gaur, Anitha K. Sharma

Abstract:

Climate change and anthropogenic stress have severely affected ecosystems all over the globe. Indian rivers are under immense pressure, facing challenges like pollution, encroachment, extreme fluctuation in the flow regime, local ignorance and lack of coordination between stakeholders. To counter all these issues, a holistic river rejuvenation plan is needed that tests, innovates and implements sustainable solutions in the river space for sustainable river management. Smart Laboratory for Clean Rivers (SLCR), an Indo-Danish collaboration project, provides a living lab setup that brings all the stakeholders (government agencies, academic and industrial partners and locals) together to engage, learn, co-create and experiment for a clean and sustainable river that lasts for ages. Just as every mega project requires piloting, SLCR has opted for a small catchment of the Varuna River, located in the Middle Ganga Basin in India. Following an integrated approach to river rejuvenation, SLCR embraces various techniques and upgrades. For maintaining flow in the channel in the lean period, Managed Aquifer Recharge (MAR) is a proven technology. In SLCR, Floa-TEM high-resolution lithological data is used in MAR models to support better decision-making on MAR structures near the river to enhance river-aquifer exchanges. Furthermore, water quality in the river is a major concern. A city like Varanasi, which is located in the last stretch of the river, generates almost 260 MLD of domestic waste in the catchment. The existing STP system is working at full efficiency. Instead of installing a new STP for the future, SLCR is upgrading those STPs with an IoT-based system that optimizes according to the nutrient load and energy consumption. SLCR also advocates nature-based solutions like a reed bed for the drains having less flow.
In search of micropollutants, SLCR uses fingerprint analysis, which involves employing advanced techniques like chromatography and mass spectrometry to create unique chemical profiles. However, rejuvenation cannot succeed without involving the entire catchment. This requires a holistic water management plan that includes storm management, water harvesting structures to efficiently manage the flow of water in the catchment, and installation of several buffer zones to restrict pollutants from entering the river. Similarly, carbon (emission and sequestration) is also an important parameter for the catchment. Adopting eco-friendly practices creates a ripple effect that positively influences the catchment's water dynamics and aids in the revival of river systems. SLCR has adopted 4 villages to make them carbon-neutral and water-positive. Moreover, for 24×7 monitoring of the river and the catchment, robust IoT devices are going to be installed to observe river and groundwater quality, groundwater level, river discharge and carbon emission in the catchment and ultimately provide fuel for the data analytics. On completion, SLCR will provide a river restoration manual, which will set out a detailed plan and method of implementation for stakeholders. Lastly, the entire process is planned in such a way that it will be managed by local administrations and stakeholders equipped through capacity-building activities. This holistic approach makes SLCR unique in the field of river rejuvenation.

Keywords: sustainable management, holistic approach, living lab, integrated river management

Procedia PDF Downloads 49
24134 Model for Introducing Products to New Customers through Decision Tree Using Algorithm C4.5 (J-48)

Authors: Komol Phaisarn, Anuphan Suttimarn, Vitchanan Keawtong, Kittisak Thongyoun, Chaiyos Jamsawang

Abstract:

This article analyzes insurance data describing customers' decisions when purchasing a life insurance pay package. The data were analyzed in order to present new customers with the Life Insurance Perfect Pay package that meets their needs as closely as possible. The basic data on the insurance pay package were collected for data mining, thus reducing the scattering of information. The data were then classified to obtain a decision model, or decision tree, using the C4.5 algorithm (J-48). In the classification, WEKA tools were used to build the model, and testing datasets were used to check the accuracy of the decision tree. Validation of the model showed that 68.43% of predictions were accurate while 31.25% were errors. The same data set was then tested with other models, i.e. Naive Bayes and Zero R. The results showed that the J-48 method predicted more accurately. The researchers therefore applied the decision tree in a program that introduces the product to new customers, supporting their decision to purchase the insurance package that meets their needs as closely as possible.
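As an illustration of how a C4.5-style tree chooses which attribute to split on, the sketch below computes the gain ratio (information gain divided by split information) for each attribute of a tiny data set. It is not the authors' WEKA workflow: the records, the attribute names (`age_band`, `income`, `buys`) and the helper functions are hypothetical and exist only for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, attr, target):
    """Gain ratio of splitting `rows` on `attr`, the criterion used by C4.5."""
    base = entropy([r[target] for r in rows])
    # Group the target labels by the value of the candidate attribute.
    groups = {}
    for r in rows:
        groups.setdefault(r[attr], []).append(r[target])
    total = len(rows)
    # Weighted entropy that remains after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    # Split information penalizes attributes with many distinct values.
    split_info = entropy([r[attr] for r in rows])
    gain = base - remainder
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical toy records; not data from the study.
customers = [
    {"age_band": "young", "income": "low", "buys": "no"},
    {"age_band": "young", "income": "high", "buys": "yes"},
    {"age_band": "senior", "income": "low", "buys": "no"},
    {"age_band": "senior", "income": "high", "buys": "yes"},
]

# The attribute with the highest gain ratio becomes the root split.
best = max(["age_band", "income"], key=lambda a: gain_ratio(customers, a, "buys"))
print(best)
```

In this toy set `income` separates the two classes perfectly, so it is selected as the first split; a full C4.5 implementation would then recurse on each branch and prune the resulting tree.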

Keywords: decision tree, data mining, customers, life insurance pay package

Procedia PDF Downloads 420
24133 Low-Surface Roughness and High Optical Quality CdS Thin Film Grown by Modified Chemical Surface Deposition Method

Authors: A. Elsayed, M. H. Dewaidar, M. Ghali

Abstract:

We report on the deposition of smooth, pinhole-free, low-surface-roughness (< 4 nm) and high-optical-quality cadmium sulfide (CdS) thin films on glass substrates using our new method based on the chemical surface deposition principle. In this method, cadmium acetate and thiourea are used as reactants under special growth conditions for deposition of CdS films. X-ray diffraction (XRD) measurements were used to examine the crystal structure of the deposited CdS films. In addition, UV-vis transmittance and low-temperature (4 K) photoluminescence (PL) measurements were performed to quantify the optical properties of the deposited films. Interestingly, we found that the XRD pattern of the deposited films changed dramatically when the growth temperature was raised during the reaction. Namely, the XRD measurements reveal a structural change of the CdS film from the cubic to the hexagonal phase upon an increase in the growth temperature from 75 °C to 200 °C. Furthermore, the deposited films show high optical quality, as confirmed by the observation of both a sharp edge in the transmittance spectra and strong PL intensity at room temperature. We also found a strong effect of the growth conditions on the optical band gap of the deposited films: a remarkable red-shift in the absorption edge with temperature is clearly seen in both transmission and PL spectra. Such tuning of both the optical band gap and the crystal structure of the deposited CdS films can be utilized for tuning the electronic band alignments between CdS and other light-harvesting materials, like CuInGaSe or CdTe, for potential improvement in the efficiency of all-solution-processed solar cell devices based on these heterostructures.

Keywords: thin film, CdS, new method, optical properties

Procedia PDF Downloads 253
24132 Enhanced Solar-Driven Evaporation Process via f-MWCNTs/PVDF Photothermal Membrane for Forward Osmosis Draw Solution Recovery

Authors: Ayat N. El-Shazly, Dina Magdy Abdo, Hamdy Maamoun Abdel-Ghafar, Xiangju Song, Heqing Jiang

Abstract:

Product water recovery and draw solution (DS) reuse is the most energy-intensive stage in forward osmosis (FO) technology. Sucrose solution is the most suitable DS for FO applications in food and beverages. However, sucrose DS recovery by conventional pressure-driven or thermally driven concentration techniques consumes considerable energy. Herein, we developed a spontaneous and sustainable solar-driven evaporation process based on a photothermal membrane for the concentration and recovery of sucrose solution. The photothermal membrane is composed of a multi-walled carbon nanotube (f-MWCNTs) photothermal layer on a hydrophilic polyvinylidene fluoride (PVDF) substrate. The f-MWCNTs photothermal layer, with its rough surface and interconnected network structure, not only improves light harvesting and light-to-heat conversion but also facilitates the transport of water molecules. The hydrophilic PVDF substrate promotes rapid water transport, ensuring an adequate water supply to the photothermal layer. As a result, the optimized f-MWCNTs/PVDF photothermal membrane exhibits an excellent light absorption of 95% and a high surface temperature of 74 °C at 1 kW m−2. Besides, it achieves an evaporation rate of 1.17 kg m−2 h−1 for a 5% (w/v) sucrose solution, which is about 5 times higher than that of natural evaporation. The designed photothermal evaporation process is capable of concentrating sucrose solution efficiently from 5% to 75% (w/v), which shows great potential for the FO process and juice concentration.

Keywords: solar, photothermal, membrane, MWCNT

Procedia PDF Downloads 94
24131 Exploring the Role of Data Mining in Crime Classification: A Systematic Literature Review

Authors: Faisal Muhibuddin, Ani Dijah Rahajoe

Abstract:

This in-depth exploration, through a systematic literature review, scrutinizes the nuanced role of data mining in the classification of criminal activities. The research focuses on investigating various methodological aspects and recent developments in leveraging data mining techniques to enhance the effectiveness and precision of crime categorization. Commencing with an exposition of the foundational concepts of crime classification and its evolutionary dynamics, this study details the paradigm shift from conventional methods towards approaches supported by data mining, addressing the challenges and complexities inherent in the modern crime landscape. Specifically, the research delves into various data mining techniques, including K-means clustering, Naïve Bayes, K-nearest neighbour, and clustering methods. A comprehensive review of the strengths and limitations of each technique provides insights into their respective contributions to improving crime classification models. The integration of diverse data sources takes centre stage in this research. A detailed analysis explores how the amalgamation of structured data (such as criminal records) and unstructured data (such as social media) can offer a holistic understanding of crime, enriching classification models with more profound insights. Furthermore, the study explores the temporal implications in crime classification, emphasizing the significance of considering temporal factors to comprehend long-term trends and seasonality. The availability of real-time data is also elucidated as a crucial element in enhancing responsiveness and accuracy in crime classification.

Keywords: data mining, classification algorithm, naïve bayes, k-means clustering, k-nearest neighbor, crime, data analysis, systematic literature review

Procedia PDF Downloads 58
24130 Assessing Supply Chain Performance through Data Mining Techniques: A Case of Automotive Industry

Authors: Emin Gundogar, Burak Erkayman, Nusret Sazak

Abstract:

Providing effective management performance across the whole supply chain is a critical issue and hard to achieve. Proper evaluation of integrated data can yield accurate information. Analysing supply chain data through OLAP (On-Line Analytical Processing) technologies can provide a multi-angle view of the work and consolidation. In this study, association rules and classification techniques are applied to measure the supply chain performance metrics of an automotive manufacturer in Turkey. Main criteria and important rules are determined. A comparison of the results of the algorithms is presented.

Keywords: supply chain performance, performance measurement, data mining, automotive

Procedia PDF Downloads 501
24129 Multimodal Data Fusion Techniques in Audiovisual Speech Recognition

Authors: Hadeer M. Sayed, Hesham E. El Deeb, Shereen A. Taie

Abstract:

In the big data era, we are facing a diversity of datasets from different sources in different domains that describe a single life event. These datasets consist of multiple modalities, each of which has a different representation, distribution, scale, and density. Multimodal fusion is the concept of integrating information from multiple modalities in a joint representation with the goal of predicting an outcome through a classification or regression task. In this paper, multimodal fusion techniques are classified into two main classes: model-agnostic techniques and model-based approaches. The paper provides a comprehensive study of recent research in each class and outlines the benefits and limitations of each. Furthermore, the audiovisual speech recognition task is presented as a case study of multimodal data fusion approaches, and the open issues arising from the limitations of current studies are discussed. This paper can be considered a powerful guide for researchers interested in multimodal data fusion and in audiovisual speech recognition in particular.

Keywords: multimodal data, data fusion, audio-visual speech recognition, neural networks

Procedia PDF Downloads 100
24128 Knowledge-Driven Decision Support System Based on Knowledge Warehouse and Data Mining by Improving Apriori Algorithm with Fuzzy Logic

Authors: Pejman Hosseinioun, Hasan Shakeri, Ghasem Ghorbanirostam

Abstract:

In recent years, research on knowledge sources, decision support systems, data mining and the process of knowledge discovery in databases has grown in importance, and each of these aspects is considered to affect the others. In this article, we merge information sources and knowledge sources to propose a knowledge-based system within the scope of management, based on storing and retrieving knowledge, to manage information and improve decision making and resources. We use data mining and the Apriori algorithm in the knowledge discovery process. One problem with the Apriori algorithm is that the user must specify the minimum support threshold. Imagine that a user wants to apply the Apriori algorithm to a database with millions of transactions. The user certainly does not have the necessary knowledge of all the transactions in that database and therefore cannot specify a suitable threshold. Our purpose in this article is to improve the Apriori algorithm. To this end, we use fuzzy logic to place the data in different clusters before applying the Apriori algorithm to the data in the database, and we also try to suggest the most suitable threshold to the user automatically.
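The threshold problem described above can be seen in a minimal sketch of the classic Apriori algorithm. This is the textbook level-wise procedure, not the fuzzy-logic variant the authors propose; the `apriori` function and the toy transactions are hypothetical and are used only for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    n = len(transactions)
    transactions = [frozenset(t) for t in transactions]
    # Level 1 candidates: every individual item.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count how many transactions contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: combine frequent (k-1)-itemsets into k-itemsets.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


# Hypothetical toy transactions.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

loose = apriori(baskets, min_support=0.5)
strict = apriori(baskets, min_support=0.75)
print(len(loose), len(strict))
```

On these four transactions a threshold of 0.5 yields five frequent itemsets, while 0.75 leaves only the two single items `bread` and `milk`, which shows how strongly the result depends on a value the user has to guess.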

Keywords: decision support system, data mining, knowledge discovery, data discovery, fuzzy logic

Procedia PDF Downloads 323
24127 The Study of Dengue Fever Outbreak in Thailand Using Geospatial Techniques, Satellite Remote Sensing Data and Big Data

Authors: Tanapat Chongkamunkong

Abstract:

The objective of this paper is to present a practical use of Geographic Information Systems (GIS) in public health, based on the spatial correlation between multiple factors and dengue fever outbreaks. Meteorological, demographic and environmental factors are compiled using GIS techniques along with Global Satellite Mapping Remote Sensing (RS) data. We use monthly dengue fever cases, population density, precipitation and Digital Elevation Model (DEM) data. The study covers 12 provinces of Thailand, using remote sensing (RS) data from January 2007 to December 2014, under climate change associated with the El Niño–Southern Oscillation (ENSO) as indicated by sea surface temperature (SST).

Keywords: dengue fever, sea surface temperature, Geographic Information System (GIS), remote sensing

Procedia PDF Downloads 185
24126 Model of Optimal Centroids Approach for Multivariate Data Classification

Authors: Pham Van Nha, Le Cam Binh

Abstract:

Particle swarm optimization (PSO) is a population-based stochastic optimization algorithm inspired by the natural behavior of birds and fish in migration and foraging for food. PSO is considered a multidisciplinary optimization model that can be applied to various optimization problems. PSO's ideas are simple and easy to understand, but PSO has only been applied to simple model problems. We think that, in order to expand the applicability of PSO to complex problems, PSO should be described more explicitly in the form of a mathematical model. In this paper, we represent PSO as a mathematical model and apply it to multivariate data classification. First, PSO's general mathematical model (MPSO) is analyzed as a universal optimization model. Then, the Model of Optimal Centroids (MOC) is proposed for multivariate data classification. Experiments were conducted on some benchmark data sets to demonstrate the effectiveness of MOC compared with several proposed schemes.
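For readers unfamiliar with PSO, the following is a minimal sketch of the standard global-best variant with an inertia weight. It does not reproduce the authors' MPSO or MOC models; the function name `pso`, the parameter values (`w`, `c1`, `c2`) and the sphere test function are assumptions chosen for illustration.

```python
import random


def pso(objective, dim, bounds, n_particles=20, iters=200,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` with a basic global-best particle swarm."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull toward personal and global bests.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    """Simple convex test function with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best_pos, best_val = pso(sphere, dim=2, bounds=(-5.0, 5.0))
print(best_pos, best_val)
```

A centroid-based classifier could in principle use such a swarm by encoding candidate class centroids as particle positions and a classification error as the objective, though how MOC does this is not specified in the abstract.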

Keywords: analysis of optimization, artificial intelligence based optimization, optimization for learning and data analysis, global optimization

Procedia PDF Downloads 201
24125 A Method of Manufacturing Low Cost Utility Robots and Vehicles

Authors: Gregory E. Ofili

Abstract:

Introduction and Objective: Climate change and a global economy mean farmers must adapt and gain access to affordable and reliable automation technologies. Key barriers include a lack of transportation, electricity, and internet service, coupled with costly enabling technologies and limited local subject matter expertise. Methodology/Approach: Resourcefulness is essential to mechanization on a farm. This runs contrary to the tech industry practice of planned obsolescence and disposal. One solution is plug-and-play hardware that allows farmers to assemble, repair, program, and service their own fleet of industrial machines. To that end, we developed a method of manufacturing low-cost utility robots, transport vehicles, and solar/wind energy harvesting systems, all running on an open-source Robot Operating System (ROS). We demonstrate this technology by fabricating a utility robot and an all-terrain (4X4) utility vehicle. The design is constructed of aluminum trusses, weighs just 40 pounds yet can transport 200 pounds of cargo, and is on sale for less than $2,000. Conclusions & Policy Implications: Electricity, internet, and automation are essential for productivity and competitiveness. With planned obsolescence, the priorities of technology suppliers are not aligned with the farmer’s realities. This patent-pending method of manufacturing low-cost industrial robots and electric vehicles has met its objective: to create low-cost machines that the farmer can assemble, program, and repair with basic hand tools.

Keywords: automation, robotics, utility robot, small-hold farm, robot operating system

Procedia PDF Downloads 60