Search results for: R data science
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 26774

Search results for: R data science

25514 Integrated On-Board Diagnostic-II and Direct Controller Area Network Access for Vehicle Monitoring System

Authors: Kavian Khosravinia, Mohd Khair Hassan, Ribhan Zafira Abdul Rahman, Syed Abdul Rahman Al-Haddad

Abstract:

The CAN (controller area network) bus is introduced as a multi-master, message broadcast system. The messages sent on the CAN are used to communicate state information, referred as a signal between different ECUs, which provides data consistency in every node of the system. OBD-II Dongles that are based on request and response method is the wide-spread solution for extracting sensor data from cars among researchers. Unfortunately, most of the past researches do not consider resolution and quantity of their input data extracted through OBD-II technology. The maximum feasible scan rate is only 9 queries per second which provide 8 data points per second with using ELM327 as well-known OBD-II dongle. This study aims to develop and design a programmable, and latency-sensitive vehicle data acquisition system that improves the modularity and flexibility to extract exact, trustworthy, and fresh car sensor data with higher frequency rates. Furthermore, the researcher must break apart, thoroughly inspect, and observe the internal network of the vehicle, which may cause severe damages to the expensive ECUs of the vehicle due to intrinsic vulnerabilities of the CAN bus during initial research. Desired sensors data were collected from various vehicles utilizing Raspberry Pi3 as computing and processing unit with using OBD (request-response) and direct CAN method at the same time. Two types of data were collected for this study. The first, CAN bus frame data that illustrates data collected for each line of hex data sent from an ECU and the second type is the OBD data that represents some limited data that is requested from ECU under standard condition. The proposed system is reconfigurable, human-readable and multi-task telematics device that can be fitted into any vehicle with minimum effort and minimum time lag in the data extraction process. The standard operational procedure experimental vehicle network test bench is developed and can be used for future vehicle network testing experiment.

Keywords: CAN bus, OBD-II, vehicle data acquisition, connected cars, telemetry, Raspberry Pi3

Procedia PDF Downloads 204
25513 Big Data in Construction Project Management: The Colombian Northeast Case

Authors: Sergio Zabala-Vargas, Miguel Jiménez-Barrera, Luz VArgas-Sánchez

Abstract:

In recent years, information related to project management in organizations has been increasing exponentially. Performance data, management statistics, indicator results have forced the collection, analysis, traceability, and dissemination of project managers to be essential. In this sense, there are current trends to facilitate efficient decision-making in emerging technology projects, such as: Machine Learning, Data Analytics, Data Mining, and Big Data. The latter is the most interesting in this project. This research is part of the thematic line Construction methods and project management. Many authors present the relevance that the use of emerging technologies, such as Big Data, has taken in recent years in project management in the construction sector. The main focus is the optimization of time, scope, budget, and in general mitigating risks. This research was developed in the northeastern region of Colombia-South America. The first phase was aimed at diagnosing the use of emerging technologies (Big-Data) in the construction sector. In Colombia, the construction sector represents more than 50% of the productive system, and more than 2 million people participate in this economic segment. The quantitative approach was used. A survey was applied to a sample of 91 companies in the construction sector. Preliminary results indicate that the use of Big Data and other emerging technologies is very low and also that there is interest in modernizing project management. There is evidence of a correlation between the interest in using new data management technologies and the incorporation of Building Information Modeling BIM. The next phase of the research will allow the generation of guidelines and strategies for the incorporation of technological tools in the construction sector in Colombia.

Keywords: big data, building information modeling, tecnology, project manamegent

Procedia PDF Downloads 128
25512 Minimum Data of a Speech Signal as Special Indicators of Identification in Phonoscopy

Authors: Nazaket Gazieva

Abstract:

Voice biometric data associated with physiological, psychological and other factors are widely used in forensic phonoscopy. There are various methods for identifying and verifying a person by voice. This article explores the minimum speech signal data as individual parameters of a speech signal. Monozygotic twins are believed to be genetically identical. Using the minimum data of the speech signal, we came to the conclusion that the voice imprint of monozygotic twins is individual. According to the conclusion of the experiment, we can conclude that the minimum indicators of the speech signal are more stable and reliable for phonoscopic examinations.

Keywords: phonogram, speech signal, temporal characteristics, fundamental frequency, biometric fingerprints

Procedia PDF Downloads 144
25511 Improving Perceptual Reasoning in School Children through Chess Training

Authors: Ebenezer Joseph, Veena Easvaradoss, S. Sundar Manoharan, David Chandran, Sumathi Chandrasekaran, T. R. Uma

Abstract:

Perceptual reasoning is the ability that incorporates fluid reasoning, spatial processing, and visual motor integration. Several theories of cognitive functioning emphasize the importance of fluid reasoning. The ability to manipulate abstractions and rules and to generalize is required for reasoning tasks. This study, funded by the Cognitive Science Research Initiative, Department of Science and Technology, Government of India, analyzed the effect of 1-year chess training on the perceptual reasoning of children. A pretest–posttest with control group design was used, with 43 (28 boys, 15 girls) children in the experimental group and 42 (26 boys, 16 girls) children in the control group. The sample was selected from children studying in two private schools from South India (grades 3 to 9), which included both the genders. The experimental group underwent weekly 1-hour chess training for 1 year. Perceptual reasoning was measured by three subtests of WISC-IV INDIA. Pre-equivalence of means was established. Further statistical analyses revealed that the experimental group had shown statistically significant improvement in perceptual reasoning compared to the control group. The present study clearly establishes a correlation between chess learning and perceptual reasoning. If perceptual reasoning can be enhanced in children, it could possibly result in the improvement of executive functions as well as the scholastic performance of the child.

Keywords: chess, cognition, intelligence, perceptual reasoning

Procedia PDF Downloads 356
25510 A Non-parametric Clustering Approach for Multivariate Geostatistical Data

Authors: Francky Fouedjio

Abstract:

Multivariate geostatistical data have become omnipresent in the geosciences and pose substantial analysis challenges. One of them is the grouping of data locations into spatially contiguous clusters so that data locations within the same cluster are more similar while clusters are different from each other, in some sense. Spatially contiguous clusters can significantly improve the interpretation that turns the resulting clusters into meaningful geographical subregions. In this paper, we develop an agglomerative hierarchical clustering approach that takes into account the spatial dependency between observations. It relies on a dissimilarity matrix built from a non-parametric kernel estimator of the spatial dependence structure of data. It integrates existing methods to find the optimal cluster number and to evaluate the contribution of variables to the clustering. The capability of the proposed approach to provide spatially compact, connected and meaningful clusters is assessed using bivariate synthetic dataset and multivariate geochemical dataset. The proposed clustering method gives satisfactory results compared to other similar geostatistical clustering methods.

Keywords: clustering, geostatistics, multivariate data, non-parametric

Procedia PDF Downloads 477
25509 A Data Mining Approach for Analysing and Predicting the Bank's Asset Liability Management Based on Basel III Norms

Authors: Nidhin Dani Abraham, T. K. Sri Shilpa

Abstract:

Asset liability management is an important aspect in banking business. Moreover, the today’s banking is based on BASEL III which strictly regulates on the counterparty default. This paper focuses on prediction and analysis of counter party default risk, which is a type of risk occurs when the customers fail to repay the amount back to the lender (bank or any financial institutions). This paper proposes an approach to reduce the counterparty risk occurring in the financial institutions using an appropriate data mining technique and thus predicts the occurrence of NPA. It also helps in asset building and restructuring quality. Liability management is very important to carry out banking business. To know and analyze the depth of liability of bank, a suitable technique is required. For that a data mining technique is being used to predict the dormant behaviour of various deposit bank customers. Various models are implemented and the results are analyzed of saving bank deposit customers. All these data are cleaned using data cleansing approach from the bank data warehouse.

Keywords: data mining, asset liability management, BASEL III, banking

Procedia PDF Downloads 552
25508 Parallel Coordinates on a Spiral Surface for Visualizing High-Dimensional Data

Authors: Chris Suma, Yingcai Xiao

Abstract:

This paper presents Parallel Coordinates on a Spiral Surface (PCoSS), a parallel coordinate based interactive visualization method for high-dimensional data, and a test implementation of the method. Plots generated by the test system are compared with those generated by XDAT, a software implementing traditional parallel coordinates. Traditional parallel coordinate plots can be cluttered when the number of data points is large or when the dimensionality of the data is high. PCoSS plots display multivariate data on a 3D spiral surface and allow users to see the whole picture of high-dimensional data with less cluttering. Taking advantage of the 3D display environment in PCoSS, users can further reduce cluttering by zooming into an axis of interest for a closer view or by moving vantage points and by reorienting the viewing angle to obtain a desired view of the plots.

Keywords: human computer interaction, parallel coordinates, spiral surface, visualization

Procedia PDF Downloads 11
25507 A Dynamic Ensemble Learning Approach for Online Anomaly Detection in Alibaba Datacenters

Authors: Wanyi Zhu, Xia Ming, Huafeng Wang, Junda Chen, Lu Liu, Jiangwei Jiang, Guohua Liu

Abstract:

Anomaly detection is a first and imperative step needed to respond to unexpected problems and to assure high performance and security in large data center management. This paper presents an online anomaly detection system through an innovative approach of ensemble machine learning and adaptive differentiation algorithms, and applies them to performance data collected from a continuous monitoring system for multi-tier web applications running in Alibaba data centers. We evaluate the effectiveness and efficiency of this algorithm with production traffic data and compare with the traditional anomaly detection approaches such as a static threshold and other deviation-based detection techniques. The experiment results show that our algorithm correctly identifies the unexpected performance variances of any running application, with an acceptable false positive rate. This proposed approach has already been deployed in real-time production environments to enhance the efficiency and stability in daily data center operations.

Keywords: Alibaba data centers, anomaly detection, big data computation, dynamic ensemble learning

Procedia PDF Downloads 201
25506 Performance Tests of Wood Glues on Different Wood Species Used in Wood Workshops: Morogoro Tanzania

Authors: Japhet N. Mwambusi

Abstract:

High tropical forests deforestation for solid wood furniture industry is among of climate change contributing agents. This pressure indirectly is caused by furniture joints failure due to poor gluing technology based on improper use of different glues to different wood species which lead to low quality and weak wood-glue joints. This study was carried in order to run performance tests of wood glues on different wood species used in wood workshops: Morogoro Tanzania whereby three popular wood species of C. lusitanica, T. glandis and E. maidenii were tested against five glues of Woodfix, Bullbond, Ponal, Fevicol and Coral found in the market. The findings were necessary on developing a guideline for proper glue selection for a particular wood species joining. Random sampling was employed to interview carpenters while conducting a survey on the background of carpenters like their education level and to determine factors that influence their glues choice. Monsanto Tensiometer was used to determine bonding strength of identified wood glues to different wood species in use under British Standard of testing wood shear strength (BS EN 205) procedures. Data obtained from interviewing carpenters were analyzed through Statistical Package of Social Science software (SPSS) to allow the comparison of different data while laboratory data were compiled, related and compared by the use of MS Excel worksheet software as well as Analysis of Variance (ANOVA). Results revealed that among all five wood glues tested in the laboratory to three different wood species, Coral performed much better with the average shear strength 4.18 N/mm2, 3.23 N/mm2 and 5.42 N/mm2 for Cypress, Teak and Eucalyptus respectively. This displays that for a strong joint to be formed to all tree wood species for soft wood and hard wood, Coral has a first priority in use. The developed table of guideline from this research can be useful to carpenters on proper glue selection to a particular wood species so as to meet glue-bond strength. This will secure furniture market as well as reduce pressure to the forests for furniture production because of the strong existing furniture due to their strong joints. Indeed, this can be a good strategy on reducing climate change speed in tropics which result from high deforestation of trees for furniture production.

Keywords: climate change, deforestation, gluing technology, joint failure, wood-glue, wood species

Procedia PDF Downloads 240
25505 Unsupervised Text Mining Approach to Early Warning System

Authors: Ichihan Tai, Bill Olson, Paul Blessner

Abstract:

Traditional early warning systems that alarm against crisis are generally based on structured or numerical data; therefore, a system that can make predictions based on unstructured textual data, an uncorrelated data source, is a great complement to the traditional early warning systems. The Chicago Board Options Exchange (CBOE) Volatility Index (VIX), commonly referred to as the fear index, measures the cost of insurance against market crash, and spikes in the event of crisis. In this study, news data is consumed for prediction of whether there will be a market-wide crisis by predicting the movement of the fear index, and the historical references to similar events are presented in an unsupervised manner. Topic modeling-based prediction and representation are made based on daily news data between 1990 and 2015 from The Wall Street Journal against VIX index data from CBOE.

Keywords: early warning system, knowledge management, market prediction, topic modeling.

Procedia PDF Downloads 338
25504 The Role of Synthetic Data in Aerial Object Detection

Authors: Ava Dodd, Jonathan Adams

Abstract:

The purpose of this study is to explore the characteristics of developing a machine learning application using synthetic data. The study is structured to develop the application for the purpose of deploying the computer vision model. The findings discuss the realities of attempting to develop a computer vision model for practical purpose, and detail the processes, tools, and techniques that were used to meet accuracy requirements. The research reveals that synthetic data represents another variable that can be adjusted to improve the performance of a computer vision model. Further, a suite of tools and tuning recommendations are provided.

Keywords: computer vision, machine learning, synthetic data, YOLOv4

Procedia PDF Downloads 225
25503 Perception-Oriented Model Driven Development for Designing Data Acquisition Process in Wireless Sensor Networks

Authors: K. Indra Gandhi

Abstract:

Wireless Sensor Networks (WSNs) have always been characterized for application-specific sensing, relaying and collection of information for further analysis. However, software development was not considered as a separate entity in this process of data collection which has posed severe limitations on the software development for WSN. Software development for WSN is a complex process since the components involved are data-driven, network-driven and application-driven in nature. This implies that there is a tremendous need for the separation of concern from the software development perspective. A layered approach for developing data acquisition design based on Model Driven Development (MDD) has been proposed as the sensed data collection process itself varies depending upon the application taken into consideration. This work focuses on the layered view of the data acquisition process so as to ease the software point of development. A metamodel has been proposed that enables reusability and realization of the software development as an adaptable component for WSN systems. Further, observing users perception indicates that proposed model helps in improving the programmer's productivity by realizing the collaborative system involved.

Keywords: data acquisition, model-driven development, separation of concern, wireless sensor networks

Procedia PDF Downloads 434
25502 Comparative Analysis of Data Gathering Protocols with Multiple Mobile Elements for Wireless Sensor Network

Authors: Bhat Geetalaxmi Jairam, D. V. Ashoka

Abstract:

Wireless Sensor Networks are used in many applications to collect sensed data from different sources. Sensed data has to be delivered through sensors wireless interface using multi-hop communication towards the sink. The data collection in wireless sensor networks consumes energy. Energy consumption is the major constraints in WSN .Reducing the energy consumption while increasing the amount of generated data is a great challenge. In this paper, we have implemented two data gathering protocols with multiple mobile sinks/elements to collect data from sensor nodes. First, is Energy-Efficient Data Gathering with Tour Length-Constrained Mobile Elements in Wireless Sensor Networks (EEDG), in which mobile sinks uses vehicle routing protocol to collect data. Second is An Intelligent Agent-based Routing Structure for Mobile Sinks in WSNs (IAR), in which mobile sinks uses prim’s algorithm to collect data. Authors have implemented concepts which are common to both protocols like deployment of mobile sinks, generating visiting schedule, collecting data from the cluster member. Authors have compared the performance of both protocols by taking statistics based on performance parameters like Delay, Packet Drop, Packet Delivery Ratio, Energy Available, Control Overhead. Authors have concluded this paper by proving EEDG is more efficient than IAR protocol but with few limitations which include unaddressed issues likes Redundancy removal, Idle listening, Mobile Sink’s pause/wait state at the node. In future work, we plan to concentrate more on these limitations to avail a new energy efficient protocol which will help in improving the life time of the WSN.

Keywords: aggregation, consumption, data gathering, efficiency

Procedia PDF Downloads 497
25501 Status and Results from EXO-200

Authors: Ryan Maclellan

Abstract:

EXO-200 has provided one of the most sensitive searches for neutrinoless double-beta decay utilizing 175 kg of enriched liquid xenon in an ultra-low background time projection chamber. This detector has demonstrated excellent energy resolution and background rejection capabilities. Using the first two years of data, EXO-200 has set a limit of 1.1x10^25 years at 90% C.L. on the neutrinoless double-beta decay half-life of Xe-136. The experiment has experienced a brief hiatus in data taking during a temporary shutdown of its host facility: the Waste Isolation Pilot Plant. EXO-200 expects to resume data taking in earnest this fall with upgraded detector electronics. Results from the analysis of EXO-200 data and an update on the current status of EXO-200 will be presented.

Keywords: double-beta, Majorana, neutrino, neutrinoless

Procedia PDF Downloads 414
25500 The Effects of Training Load on Some Selected Fitness Variables in the Case of U-17 Female Volleyball Project Players, Central Ethiopia

Authors: Behailu Shigute Habtemariam

Abstract:

The aim of the study was to examine the effects of training load on some selected fitness performance variables of volleyball players in the case of U-17 female volleyball project players in the central Ethiopia region. Methods: In this study, quasi-experimental design was used. For the purpose of this study, twenty-three volleyball players were used as a subject from the two projects. The data collected through fitness performance assessment were analyzed and interpreted into a meaningful idea using manually as well as with computer in order to compare physical fitness variables and changes observed among participants. Those are taking part in the effects of training load on some selected physical fitness variables. The collected data were analyzed by means of the Statistical Package for Social Science version (SPSS V 20). The independent t-test was used to show the mean differences between the groups, and the paired T-test was used to compare the mean differences of the pre and post-training within each group. The level of significance will be set at Alfa 0.05. Results: The results are displayed using tables and figures. A significant difference was found in the mean differences of pre-test values (19.7 cm) and post-test values (37.5 cm) of the Durame city project on the flexibility test (MD =17.8 cm, P = 0.00). On the other hand, there was a significant difference in the mean difference of pre-test values of (18 cm) and post-test values (26.3 cm) of the Hosana city project on the flexibility test ( MD = 8.3 cm, P = 0.00). Conclusion: According to the results of the present studies, there were significant improvements from pre to post-test at Durame City and Hosana City projects on agility, flexibility, power, and speed fitness tests. On the other hand, a significant difference was not found before beginning the exercise between the two projects; however, a significant difference was found after 12 weeks of training.

Keywords: overload, performance, training, volleyball

Procedia PDF Downloads 97
25499 Citation Analysis of New Zealand Court Decisions

Authors: Tobias Milz, L. Macpherson, Varvara Vetrova

Abstract:

The law is a fundamental pillar of human societies as it shapes, controls and governs how humans conduct business, behave and interact with each other. Recent advances in computer-assisted technologies such as NLP, data science and AI are creating opportunities to support the practice, research and study of this pervasive domain. It is therefore not surprising that there has been an increase in investments into supporting technologies for the legal industry (also known as “legal tech” or “law tech”) over the last decade. A sub-discipline of particular appeal is concerned with assisted legal research. Supporting law researchers and practitioners to retrieve information from the vast amount of ever-growing legal documentation is of natural interest to the legal research community. One tool that has been in use for this purpose since the early nineteenth century is legal citation indexing. Among other use cases, they provided an effective means to discover new precedent cases. Nowadays, computer-assisted network analysis tools can allow for new and more efficient ways to reveal the “hidden” information that is conveyed through citation behavior. Unfortunately, access to openly available legal data is still lacking in New Zealand and access to such networks is only commercially available via providers such as LexisNexis. Consequently, there is a need to create, analyze and provide a legal citation network with sufficient data to support legal research tasks. This paper describes the development and analysis of a legal citation Network for New Zealand containing over 300.000 decisions from 125 different courts of all areas of law and jurisdiction. Using python, the authors assembled web crawlers, scrapers and an OCR pipeline to collect and convert court decisions from openly available sources such as NZLII into uniform and machine-readable text. This facilitated the use of regular expressions to identify references to other court decisions from within the decision text. The data was then imported into a graph-based database (Neo4j) with the courts and their respective cases represented as nodes and the extracted citations as links. Furthermore, additional links between courts of connected cases were added to indicate an indirect citation between the courts. Neo4j, as a graph-based database, allows efficient querying and use of network algorithms such as PageRank to reveal the most influential/most cited courts and court decisions over time. This paper shows that the in-degree distribution of the New Zealand legal citation network resembles a power-law distribution, which indicates a possible scale-free behavior of the network. This is in line with findings of the respective citation networks of the U.S. Supreme Court, Austria and Germany. The authors of this paper provide the database as an openly available data source to support further legal research. The decision texts can be exported from the database to be used for NLP-related legal research, while the network can be used for in-depth analysis. For example, users of the database can specify the network algorithms and metrics to only include specific courts to filter the results to the area of law of interest.

Keywords: case citation network, citation analysis, network analysis, Neo4j

Procedia PDF Downloads 107
25498 Remaining Useful Life (RUL) Assessment Using Progressive Bearing Degradation Data and ANN Model

Authors: Amit R. Bhende, G. K. Awari

Abstract:

Remaining useful life (RUL) prediction is one of key technologies to realize prognostics and health management that is being widely applied in many industrial systems to ensure high system availability over their life cycles. The present work proposes a data-driven method of RUL prediction based on multiple health state assessment for rolling element bearings. Bearing degradation data at three different conditions from run to failure is used. A RUL prediction model is separately built in each condition. Feed forward back propagation neural network models are developed for prediction modeling.

Keywords: bearing degradation data, remaining useful life (RUL), back propagation, prognosis

Procedia PDF Downloads 436
25497 Spatio-Temporal Data Mining with Association Rules for Lake Van

Authors: Tolga Aydin, M. Fatih Alaeddinoğlu

Abstract:

People, throughout the history, have made estimates and inferences about the future by using their past experiences. Developing information technologies and the improvements in the database management systems make it possible to extract useful information from knowledge in hand for the strategic decisions. Therefore, different methods have been developed. Data mining by association rules learning is one of such methods. Apriori algorithm, one of the well-known association rules learning algorithms, is not commonly used in spatio-temporal data sets. However, it is possible to embed time and space features into the data sets and make Apriori algorithm a suitable data mining technique for learning spatio-temporal association rules. Lake Van, the largest lake of Turkey, is a closed basin. This feature causes the volume of the lake to increase or decrease as a result of change in water amount it holds. In this study, evaporation, humidity, lake altitude, amount of rainfall and temperature parameters recorded in Lake Van region throughout the years are used by the Apriori algorithm and a spatio-temporal data mining application is developed to identify overflows and newly-formed soil regions (underflows) occurring in the coastal parts of Lake Van. Identifying possible reasons of overflows and underflows may be used to alert the experts to take precautions and make the necessary investments.

Keywords: apriori algorithm, association rules, data mining, spatio-temporal data

Procedia PDF Downloads 374
25496 Building Data Infrastructure for Public Use and Informed Decision Making in Developing Countries-Nigeria

Authors: Busayo Fashoto, Abdulhakeem Shaibu, Justice Agbadu, Samuel Aiyeoribe

Abstract:

Data has gone from just rows and columns to being an infrastructure itself. The traditional medium of data infrastructure has been managed by individuals in different industries and saved on personal work tools; one of such is the laptop. This hinders data sharing and Sustainable Development Goal (SDG) 9 for infrastructure sustainability across all countries and regions. However, there has been a constant demand for data across different agencies and ministries by investors and decision-makers. The rapid development and adoption of open-source technologies that promote the collection and processing of data in new ways and in ever-increasing volumes are creating new data infrastructure in sectors such as lands and health, among others. This paper examines the process of developing data infrastructure and, by extension, a data portal to provide baseline data for sustainable development and decision making in Nigeria. This paper employs the FAIR principle (Findable, Accessible, Interoperable, and Reusable) of data management using open-source technology tools to develop data portals for public use. eHealth Africa, an organization that uses technology to drive public health interventions in Nigeria, developed a data portal which is a typical data infrastructure that serves as a repository for various datasets on administrative boundaries, points of interest, settlements, social infrastructure, amenities, and others. This portal makes it possible for users to have access to datasets of interest at any point in time at no cost. A skeletal infrastructure of this data portal encompasses the use of open-source technology such as Postgres database, GeoServer, GeoNetwork, and CKan. These tools made the infrastructure sustainable, thus promoting the achievement of SDG 9 (Industries, Innovation, and Infrastructure). As of 6th August 2021, a wider cross-section of 8192 users had been created, 2262 datasets had been downloaded, and 817 maps had been created from the platform. This paper shows the use of rapid development and adoption of technologies that facilitates data collection, processing, and publishing in new ways and in ever-increasing volumes. In addition, the paper is explicit on new data infrastructure in sectors such as health, social amenities, and agriculture. Furthermore, this paper reveals the importance of cross-sectional data infrastructures for planning and decision making, which in turn can form a central data repository for sustainable development across developing countries.

Keywords: data portal, data infrastructure, open source, sustainability

Procedia PDF Downloads 98
25495 Process Data-Driven Representation of Abnormalities for Efficient Process Control

Authors: Hyun-Woo Cho

Abstract:

Unexpected operational events or abnormalities of industrial processes have a serious impact on the quality of final product of interest. In terms of statistical process control, fault detection and diagnosis of processes is one of the essential tasks needed to run the process safely. In this work, nonlinear representation of process measurement data is presented and evaluated using a simulation process. The effect of using different representation methods on the diagnosis performance is tested in terms of computational efficiency and data handling. The results have shown that the nonlinear representation technique produced more reliable diagnosis results and outperforms linear methods. The use of data filtering step improved computational speed and diagnosis performance for test data sets. The presented scheme is different from existing ones in that it attempts to extract the fault pattern in the reduced space, not in the original process variable space. Thus this scheme helps to reduce the sensitivity of empirical models to noise.

Keywords: fault diagnosis, nonlinear technique, process data, reduced spaces

Procedia PDF Downloads 247
25494 Text-to-Speech in Azerbaijani Language via Transfer Learning in a Low Resource Environment

Authors: Dzhavidan Zeinalov, Bugra Sen, Firangiz Aslanova

Abstract:

Most text-to-speech models cannot operate well in low-resource languages and require a great amount of high-quality training data to be considered good enough. Yet, with the improvements made in ASR systems, it is now much easier than ever to collect data for the design of custom text-to-speech models. In this work, our work on using the ASR model to collect data to build a viable text-to-speech system for one of the leading financial institutions of Azerbaijan will be outlined. NVIDIA’s implementation of the Tacotron 2 model was utilized along with the HiFiGAN vocoder. As for the training, the model was first trained with high-quality audio data collected from the Internet, then fine-tuned on the bank’s single speaker call center data. The results were then evaluated by 50 different listeners and got a mean opinion score of 4.17, displaying that our method is indeed viable. With this, we have successfully designed the first text-to-speech model in Azerbaijani and publicly shared 12 hours of audiobook data for everyone to use.

Keywords: Azerbaijani language, HiFiGAN, Tacotron 2, text-to-speech, transfer learning, whisper

Procedia PDF Downloads 44
25493 An Empirical Evaluation of Performance of Machine Learning Techniques on Imbalanced Software Quality Data

Authors: Ruchika Malhotra, Megha Khanna

Abstract:

The development of change prediction models can help the software practitioners in planning testing and inspection resources at early phases of software development. However, a major challenge faced during the training process of any classification model is the imbalanced nature of the software quality data. A data with very few minority outcome categories leads to inefficient learning process and a classification model developed from the imbalanced data generally does not predict these minority categories correctly. Thus, for a given dataset, a minority of classes may be change prone whereas a majority of classes may be non-change prone. This study explores various alternatives for adeptly handling the imbalanced software quality data using different sampling methods and effective MetaCost learners. The study also analyzes and justifies the use of different performance metrics while dealing with the imbalanced data. In order to empirically validate different alternatives, the study uses change data from three application packages of open-source Android data set and evaluates the performance of six different machine learning techniques. The results of the study indicate extensive improvement in the performance of the classification models when using resampling method and robust performance measures.

Keywords: change proneness, empirical validation, imbalanced learning, machine learning techniques, object-oriented metrics

Procedia PDF Downloads 418
25492 An International Curriculum Development for Languages and Technology

Authors: Miguel Nino

Abstract:

When considering the challenges of a changing and demanding globalizing world, it is important to reflect on how university students will be prepared for the realities of internationalization, marketization and intercultural conversation. The present study is an interdisciplinary program designed to respond to the needs of the global community. The proposal bridges the humanities and science through three different fields: Languages, graphic design and computer science, specifically, fundamentals of programming such as python, java script and software animation. Therefore, the goal of the four year program is twofold: First, enable students for intercultural communication between English and other languages such as Spanish, Mandarin, French or German. Second, students will acquire knowledge in practical software and relevant employable skills to collaborate in assisted computer projects that most probable will require essential programing background in interpreted or compiled languages. In order to become inclusive and constructivist, the cognitive linguistics approach is suggested for the three different fields, particularly for languages that rely on the traditional method of repetition. This methodology will help students develop their creativity and encourage them to become independent problem solving individuals, as languages enhance their common ground of interaction for culture and technology. Participants in this course of study will be evaluated in their second language acquisition at the Intermediate-High level. For graphic design and computer science students will apply their creative digital skills, as well as their critical thinking skills learned from the cognitive linguistics approach, to collaborate on a group project design to find solutions for media web design problems or marketing experimentation for a company or the community. It is understood that it will be necessary to apply programming knowledge and skills to deliver the final product. In conclusion, the program equips students with linguistics knowledge and skills to be competent in intercultural communication, where English, the lingua franca, remains the medium for marketing and product delivery. In addition to their employability, students can expand their knowledge and skills in digital humanities, computational linguistics, or increase their portfolio in advertising and marketing. These students will be the global human capital for the competitive globalizing community.

Keywords: curriculum, international, languages, technology

Procedia PDF Downloads 443
25491 Effects of Computer Aided Instructional Package on Performance and Retention of Genetic Concepts amongst Secondary School Students in Niger State, Nigeria

Authors: Muhammad R. Bello, Mamman A. Wasagu, Yahya M. Kamar

Abstract:

The study investigated the effects of computer-aided instructional package (CAIP) on performance and retention of genetic concepts among secondary school students in Niger State. Quasi-experimental research design i.e. pre-test-post-test experimental and control groups were adopted for the study. The population of the study was all senior secondary school three (SS3) students’ offering biology. A sample of 223 students was randomly drawn from six purposively selected secondary schools. The researchers’ developed computer aided instructional package (CAIP) on genetic concepts was used as treatment instrument for the experimental group while the control group was exposed to the conventional lecture method (CLM). The instrument for data collection was a Genetic Performance Test (GEPET) that had 50 multiple-choice questions which were validated by science educators. A Reliability coefficient of 0.92 was obtained for GEPET using Pearson Product Moment Correlation (PPMC). The data collected were analyzed using IBM SPSS Version 20 package for computation of Means, Standard deviation, t-test, and analysis of covariance (ANCOVA). The ANOVA analysis (Fcal (220) = 27.147, P < 0.05) shows that students who received instruction with CAIP outperformed the students who received instruction with CLM and also had higher retention. The findings also revealed no significant difference in performance and retention between male and female students (tcal (103) = -1.429, P > 0.05). It was recommended amongst others that teachers should use computer-aided instructional package in teaching genetic concepts in order to improve students’ performance and retention in biology subject. Keywords: Computer-aided Instructional Package, Performance, Retention and Genetic Concepts.

Keywords: computer aided instructional package, performance, retention, genetic concepts, senior secondary school students

Procedia PDF Downloads 362
25490 Variance-Aware Routing and Authentication Scheme for Harvesting Data in Cloud-Centric Wireless Sensor Networks

Authors: Olakanmi Oladayo Olufemi, Bamifewe Olusegun James, Badmus Yaya Opeyemi, Adegoke Kayode

Abstract:

The wireless sensor network (WSN) has made a significant contribution to the emergence of various intelligent services or cloud-based applications. Most of the time, these data are stored on a cloud platform for efficient management and sharing among different services or users. However, the sensitivity of the data makes them prone to various confidentiality and performance-related attacks during and after harvesting. Various security schemes have been developed to ensure the integrity and confidentiality of the WSNs' data. However, their specificity towards particular attacks and the resource constraint and heterogeneity of WSNs make most of these schemes imperfect. In this paper, we propose a secure variance-aware routing and authentication scheme with two-tier verification to collect, share, and manage WSN data. The scheme is capable of classifying WSN into different subnets, detecting any attempt of wormhole and black hole attack during harvesting, and enforcing access control on the harvested data stored in the cloud. The results of the analysis showed that the proposed scheme has more security functionalities than other related schemes, solves most of the WSNs and cloud security issues, prevents wormhole and black hole attacks, identifies the attackers during data harvesting, and enforces access control on the harvested data stored in the cloud at low computational, storage, and communication overheads.

Keywords: data block, heterogeneous IoT network, data harvesting, wormhole attack, blackhole attack access control

Procedia PDF Downloads 84
25489 Students’ Perception of Effort and Emotional Costs in Chemistry Courses

Authors: Guizella Rocabado, Cassidy Wilkes

Abstract:

It is well known that chemistry is one of the most feared courses in college. Although many students enjoy learning about science, most of them perceive that chemistry is “too difficult”. These perceptions of chemistry result in many students not considering Science, Technology, Engineering, and Mathematics (STEM) majors because they require chemistry courses. Ultimately, these perceptions are also thought to be related to high attrition rates of students who begin STEM majors but do not persist. Students perceived costs of a chemistry class can be many, such as task effort, loss of valued alternatives, emotional, and others. These costs might be overcome by students’ interests and goals, yet the level of perceived costs might have a lasting impact on the students’ overall perception of chemistry and their desire to pursue chemistry and other STEM careers in the future. In this mixed methods study, we investigated task effort and emotional cost, as well as a mastery or performance goal orientation, and the impact these constructs may have on achievement in general chemistry classrooms. Utilizing cluster analysis as well as student interviews, we investigated students’ profiles of perceived cost and goal orientation as it relates to their final grades. Our results show that students who are well prepared for general chemistry, such as those who have taken chemistry in high school, display less negative perceived costs and thus believe they can master the material more fully. Other interesting results have also emerged from this research, which has the potential to have an impact on future instruction of these courses.

Keywords: chemistry education, motivation, affect, perceived costs, goal orientations

Procedia PDF Downloads 91
25488 An Efficient and Low Cost Protocol for Rapid and Mass in vitro Propagation of Hyssopus officinalis L.

Authors: Ira V. Stancheva, Ely G. Zayova, Maria P. Geneva, Marieta G. Hristozkova, Lyudmila I. Dimitrova, Maria I. Petrova

Abstract:

The study describes a highly efficient and low-cost protocol for rapid and mass in vitro propagation of medicinal and aromatic plant species (Hyssopus officinalis L., Lamiaceae). Hyssop is an important aromatic herb used for its medicinal values because of its antioxidant, anti-inflammatory and antimicrobial properties. The protocol for large-scale multiplication of this aromatic plant was developed using young stem tips explants. The explants were sterilized with 0.04% mercuric chloride (HgCl₂) solution for 20 minutes and washing three times with sterile distilled water in 15 minutes. The cultural media was full and half strength Murashige and Skoog medium containing indole-3-butyric acid. Full and ½ Murashige and Skoog media without auxin were used as controls. For each variant 20 glass tubes with two plants were used. In each tube two tip and nodal explants were inoculated. Maximum shoot and root number were obtained on ½ Murashige and Skoog medium supplemented with 0.1 mg L-1 indole-3-butyric acid at the same time after four weeks of culture. The number of shoots per explant and shoot height were considered. The data on rooting percentage, the number of roots per plant and root length were collected after the same cultural period. The highest percentage of survival 85% for this medicinal plant was recorded in mixture of soil, sand and perlite (2:1:1 v/v/v). This mixture was most suitable for acclimatization of all propagated plants. Ex vitro acclimatization was carried out at 24±1 °C and 70% relative humidity under 16 h illuminations (50 μmol m⁻²s⁻¹). After adaptation period, the all plants were transferred to the field. The plants flowered within three months after transplantation. Phenotypic variations in the acclimatized plants were not observed. An average of 90% of the acclimatized plants survived after transferring into the field. All the in vitro propagated plants displayed normal development under the field conditions. Developed in vitro techniques could provide a promising alternative tool for large-scale propagation that increases the number of homologous plants for field cultivation. Acknowledgments: This study was conducted with financial support from National Science Fund at the Bulgarian Ministry of Education and Science, Project DN06/7 17.12.16.

Keywords: Hyssopus officinalis L., in vitro culture, micro propagation, acclimatization

Procedia PDF Downloads 311
25487 Quality of Age Reporting from Tanzania 2012 Census Results: An Assessment Using Whipple’s Index, Myer’s Blended Index, and Age-Sex Accuracy Index

Authors: A. Sathiya Susuman, Hamisi F. Hamisi

Abstract:

Background: Many socio-economic and demographic data are age-sex attributed. However, a variety of irregularities and misstatement are noted with respect to age-related data and less to sex data because of its biological differences between the genders. Noting the misstatement/misreporting of age data regardless of its significance importance in demographics and epidemiological studies, this study aims at assessing the quality of 2012 Tanzania Population and Housing Census Results. Methods: Data for the analysis are downloaded from Tanzania National Bureau of Statistics. Age heaping and digit preference were measured using summary indices viz., Whipple’s index, Myers’ blended index, and Age-Sex Accuracy index. Results: The recorded Whipple’s index for both sexes was 154.43; male has the lowest index of about 152.65 while female has the highest index of about 156.07. For Myers’ blended index, the preferences were at digits ‘0’ and ‘5’ while avoidance were at digits ‘1’ and ‘3’ for both sexes. Finally, Age-sex index stood at 59.8 where sex ratio score was 5.82 and age ratio scores were 20.89 and 21.4 for males and female respectively. Conclusion: The evaluation of the 2012 PHC data using the demographic techniques has qualified the data inaccurate as the results of systematic heaping and digit preferences/avoidances. Thus, innovative methods in data collection along with measuring and minimizing errors using statistical techniques should be used to ensure accuracy of age data.

Keywords: age heaping, digit preference/avoidance, summary indices, Whipple’s index, Myer’s index, age-sex accuracy index

Procedia PDF Downloads 476
25486 Intelligent Quality Management System on the Example оf Bread Baking

Authors: Irbulat Utepbergenov, Lyazzat Issabekova, Shara Toybayeva

Abstract:

This article discusses quality management using the bread baking process as an example. The baking process must be strictly controlled and repeatable. Automation and monitoring of quality management systems can help. After baking bread, quality control of the finished product should be carried out. This may include an evaluation of appearance, weight, texture, and flavor. It is important to continuously work to improve processes and products based on data and feedback from the quality management system. A method and model of automated quality management and an intelligent automated management system based on intelligent technologies are proposed, which allow to automate the processes of QMS implementation and support and improve the validity, efficiency, and effectiveness of management decisions by automating a number of functions of decision makers and staff. This project is supported by the grant of the Ministry of Education and Science of the Republic of Kazakhstan (Zhas Galym project No. AR 13268939 Research and development of digital technologies to ensure consistency of the carriers of normative documents of the quality management system).

Keywords: automated control system, quality management, efficiency evaluation, bakery oven, intelligent system

Procedia PDF Downloads 38
25485 Metal Ship and Robotic Car: A Hands-On Activity to Develop Scientific and Engineering Skills for High School Students

Authors: Jutharat Sunprasert, Ekapong Hirunsirisawat, Narongrit Waraporn, Somporn Peansukmanee

Abstract:

Metal Ship and Robotic Car is one of the hands-on activities in the course, the Fundamental of Engineering that can be divided into three parts. The first part, the metal ships, was made by using engineering drawings, physics and mathematics knowledge. The second part is where the students learned how to construct a robotic car and control it using computer programming. In the last part, the students had to combine the workings of these two objects in the final testing. This aim of study was to investigate the effectiveness of hands-on activity by integrating Science, Technology, Engineering and Mathematics (STEM) concepts to develop scientific and engineering skills. The results showed that the majority of students felt this hands-on activity lead to an increased confidence level in the integration of STEM. Moreover, 48% of all students engaged well with the STEM concepts. Students could obtain the knowledge of STEM through hands-on activities with the topics science and mathematics, engineering drawing, engineering workshop and computer programming; most students agree and strongly agree with this learning process. This indicated that the hands-on activity: “Metal Ship and Robotic Car” is a useful tool to integrate each aspect of STEM. Furthermore, hands-on activities positively influence a student’s interest which leads to increased learning achievement and also in developing scientific and engineering skills.

Keywords: hands-on activity, STEM education, computer programming, metal work

Procedia PDF Downloads 465