Search results for: minimum data set
24399 Automatic Tagging and Accuracy in Assamese Text Data
Authors: Chayanika Hazarika Bordoloi
Abstract:
This paper is an attempt to work on a highly inflectional language called Assamese. This is also one of the national languages of India and very little has been achieved in terms of computational research. Building a language processing tool for a natural language is not very smooth as the standard and language representation change at various levels. This paper presents inflectional suffixes of Assamese verbs and how the statistical tools, along with linguistic features, can improve the tagging accuracy. Conditional random fields (CRF tool) was used to automatically tag and train the text data; however, accuracy was improved after linguistic featured were fed into the training data. Assamese is a highly inflectional language; hence, it is challenging to standardizing its morphology. Inflectional suffixes are used as a feature of the text data. In order to analyze the inflections of Assamese word forms, a list of suffixes is prepared. This list comprises suffixes, comprising of all possible suffixes that various categories can take is prepared. Assamese words can be classified into inflected classes (noun, pronoun, adjective and verb) and un-inflected classes (adverb and particle). The corpus used for this morphological analysis has huge tokens. The corpus is a mixed corpus and it has given satisfactory accuracy. The accuracy rate of the tagger has gradually improved with the modified training data.Keywords: CRF, morphology, tagging, tagset
Procedia PDF Downloads 19424398 Optimization of Bio-Based Mixture of Canarium Luzonicum and Calcium Oxide as Coating Material for Reinforcing Steel Bars
Authors: Charizza D. Montarin, Daryl Jae S. Sigue, Gilford Estores
Abstract:
Philippines was moderately vulnerable to corrosion and to prevent this problem, surface coating should be applied. The main objective of this research was to develop and optimize a bio-based mixture of Pili Resin and Lime as Coating Materials. There are three (3) factors to be considered in choosing the best coating material such as chemical adhesion, friction, and the bearing/shear against the steel bar-concrete interface. Fortunately, both proportions of the Bio-based coating materials (50:50 and 65:35) do not have red rust formation complying with ASTM B117 but failed in terms of ASTM D 3359. Splitting failures of concrete were observed in the Unconfined Reinforced Concrete Samples. All of the steel bars (uncoated and coated) surpassed the Minimum Bond strength (NSCP 2015) about 203% to 285%. The experiments were about 1% to 3% of the results from the ANSYS Simulations with and without Salt Spray Test. Using the bio-based and epoxy coatings, normal splitting strengths were declined. However, there has no significant difference between the results. Thus, the bio-based coating materials can be used as an alternative for the epoxy coating materials and it was highly recommended for Low – Rise Building only.Keywords: Canarium luzonicum, calcium oxide, corrosion, finite element simulations
Procedia PDF Downloads 32324397 A Human Activity Recognition System Based on Sensory Data Related to Object Usage
Authors: M. Abdullah, Al-Wadud
Abstract:
Sensor-based activity recognition systems usually accounts which sensors have been activated to perform an activity. The system then combines the conditional probabilities of those sensors to represent different activities and takes the decision based on that. However, the information about the sensors which are not activated may also be of great help in deciding which activity has been performed. This paper proposes an approach where the sensory data related to both usage and non-usage of objects are utilized to make the classification of activities. Experimental results also show the promising performance of the proposed method.Keywords: Naïve Bayesian, based classification, activity recognition, sensor data, object-usage model
Procedia PDF Downloads 32224396 Application of Post-Stack and Pre-Stack Seismic Inversion for Prediction of Hydrocarbon Reservoirs in a Persian Gulf Gas Field
Authors: Nastaran Moosavi, Mohammad Mokhtari
Abstract:
Seismic inversion is a technique which has been in use for years and its main goal is to estimate and to model physical characteristics of rocks and fluids. Generally, it is a combination of seismic and well-log data. Seismic inversion can be carried out through different methods; we have conducted and compared post-stack and pre- stack seismic inversion methods on real data in one of the fields in the Persian Gulf. Pre-stack seismic inversion can transform seismic data to rock physics such as P-impedance, S-impedance and density. While post- stack seismic inversion can just estimate P-impedance. Then these parameters can be used in reservoir identification. Based on the results of inverting seismic data, a gas reservoir was detected in one of Hydrocarbon oil fields in south of Iran (Persian Gulf). By comparing post stack and pre-stack seismic inversion it can be concluded that the pre-stack seismic inversion provides a more reliable and detailed information for identification and prediction of hydrocarbon reservoirs.Keywords: density, p-impedance, s-impedance, post-stack seismic inversion, pre-stack seismic inversion
Procedia PDF Downloads 32424395 A Data-Driven Monitoring Technique Using Combined Anomaly Detectors
Authors: Fouzi Harrou, Ying Sun, Sofiane Khadraoui
Abstract:
Anomaly detection based on Principal Component Analysis (PCA) was studied intensively and largely applied to multivariate processes with highly cross-correlated process variables. Monitoring metrics such as the Hotelling's T2 and the Q statistics are usually used in PCA-based monitoring to elucidate the pattern variations in the principal and residual subspaces, respectively. However, these metrics are ill suited to detect small faults. In this paper, the Exponentially Weighted Moving Average (EWMA) based on the Q and T statistics, T2-EWMA and Q-EWMA, were developed for detecting faults in the process mean. The performance of the proposed methods was compared with that of the conventional PCA-based fault detection method using synthetic data. The results clearly show the benefit and the effectiveness of the proposed methods over the conventional PCA method, especially for detecting small faults in highly correlated multivariate data.Keywords: data-driven method, process control, anomaly detection, dimensionality reduction
Procedia PDF Downloads 29924394 Antimicrobial Activity of Different Essential Oils in Synergy with Amoxicillin against Clinical Isolates of Methicillin-Resistant Staphylococcus aureus
Authors: Naheed Niaz, Nimra Naeem, Bushra Uzair, Riffat Tahira
Abstract:
Antibacterial activity of different traditional plants essential oils against clinical isolates of Methicillin-resistant Staphylococcus aureus (MRSA) through disk diffusion method was evaluated. All the tested essential oils, in different concentrations, inhibited growth of S. aureus to varying degrees. Cinnamon and Thyme essential oils were observed to be the “best” against test pathogen. Even at lowest concentration of these essential oils i.e. 25 µl/ml, clear zone of inhibition was recorded 9+0.085mm and 8+0.051mm respectively, and at higher concentrations there was a total reduction in growth of MRSA. The study also focused on analyzing the synergistic effects of essential oils in combination with amoxicillin. Results showed that oregano and pennyroyal mint essential oils which were not very effective alone turned out to be strong synergistic enhancers. The activity increased with increase in concentration of the essential oils. It may be concluded from present results that cinnamon and thyme essential oils could be used as potential antimicrobial source for the treatment of infections caused by Methicillin-resistant Staphylococcus aureus (MRSA).Keywords: Staphylococcus aureus, essential oils, antibiotics, combination therapy, minimum inhibitory concentration
Procedia PDF Downloads 44724393 Leveraging Power BI for Advanced Geotechnical Data Analysis and Visualization in Mining Projects
Authors: Elaheh Talebi, Fariba Yavari, Lucy Philip, Lesley Town
Abstract:
The mining industry generates vast amounts of data, necessitating robust data management systems and advanced analytics tools to achieve better decision-making processes in the development of mining production and maintaining safety. This paper highlights the advantages of Power BI, a powerful intelligence tool, over traditional Excel-based approaches for effectively managing and harnessing mining data. Power BI enables professionals to connect and integrate multiple data sources, ensuring real-time access to up-to-date information. Its interactive visualizations and dashboards offer an intuitive interface for exploring and analyzing geotechnical data. Advanced analytics is a collection of data analysis techniques to improve decision-making. Leveraging some of the most complex techniques in data science, advanced analytics is used to do everything from detecting data errors and ensuring data accuracy to directing the development of future project phases. However, while Power BI is a robust tool, specific visualizations required by geotechnical engineers may have limitations. This paper studies the capability to use Python or R programming within the Power BI dashboard to enable advanced analytics, additional functionalities, and customized visualizations. This dashboard provides comprehensive tools for analyzing and visualizing key geotechnical data metrics, including spatial representation on maps, field and lab test results, and subsurface rock and soil characteristics. Advanced visualizations like borehole logs and Stereonet were implemented using Python programming within the Power BI dashboard, enhancing the understanding and communication of geotechnical information. Moreover, the dashboard's flexibility allows for the incorporation of additional data and visualizations based on the project scope and available data, such as pit design, rock fall analyses, rock mass characterization, and drone data. This further enhances the dashboard's usefulness in future projects, including operation, development, closure, and rehabilitation phases. Additionally, this helps in minimizing the necessity of utilizing multiple software programs in projects. This geotechnical dashboard in Power BI serves as a user-friendly solution for analyzing, visualizing, and communicating both new and historical geotechnical data, aiding in informed decision-making and efficient project management throughout various project stages. Its ability to generate dynamic reports and share them with clients in a collaborative manner further enhances decision-making processes and facilitates effective communication within geotechnical projects in the mining industry.Keywords: geotechnical data analysis, power BI, visualization, decision-making, mining industry
Procedia PDF Downloads 9224392 An Image Stitching Approach for Scoliosis Analysis
Authors: Siti Salbiah Samsudin, Hamzah Arof, Ainuddin Wahid Abdul Wahab, Mohd Yamani Idna Idris
Abstract:
Standard X-ray spine images produced by conventional screen-film technique have a limited field of view. This limitation may obstruct a complete inspection of the spine unless images of different parts of the spine are placed next to each other contiguously to form a complete structure. Another solution to producing a whole spine image is by assembling the digitized x-ray images of its parts automatically using image stitching. This paper presents a new Medical Image Stitching (MIS) method that utilizes Minimum Average Correlation Energy (MACE) filters to identify and merge pairs of x-ray medical images. The effectiveness of the proposed method is demonstrated in two sets of experiments involving two databases which contain a total of 40 pairs of overlapping and non-overlapping spine images. The experimental results are compared to those produced by the Normalized Cross Correlation (NCC) and Phase Only Correlation (POC) methods for comparison. It is found that the proposed method outperforms those of the NCC and POC methods in identifying both the overlapping and non-overlapping medical images. The efficacy of the proposed method is further vindicated by its average execution time which is about two to five times shorter than those of the POC and NCC methods.Keywords: image stitching, MACE filter, panorama image, scoliosis
Procedia PDF Downloads 45824391 Modeling, Analysis, and Optimization of Process Parameters of Metal Spinning
Authors: B. Ravi Kumar, S. Gajanana, K. Hemachandra Reddy, K. Udayani
Abstract:
Physically into various derived shapes and sizes under the effect of externally applied forces. The spinning process is an advanced plastic working technology and is frequently used for manufacturing axisymmetric shapes. Over the last few decades, Sheet metal spinning has developed significantly and spun products have widely used in various industries. Nowadays the process has been expanded to new horizons in industries, since tendency to use minimum tool and equipment costs and also using lower forces with the output of excellent surface quality and good mechanical properties. The automation of the process is of greater importance, due to its wider applications like decorative household goods, rocket nose cones, gas cylinders, etc. This paper aims to gain insight into the conventional spinning process by employing experimental and numerical methods. The present work proposes an approach for optimizing process parameters are mandrel speed (rpm), roller nose radius (mm), thickness of the sheet (mm). Forming force, surface roughness and strain are the responses.in spinning of Aluminum (2024-T3) using DOE-Response Surface Methodology (RSM) and Analysis of variance (ANOVA). The FEA software is used for modeling and analysis. The process parameters considered in the experimentation.Keywords: FEA, RSM, process parameters, sheet metal spinning
Procedia PDF Downloads 31924390 An Investigation of E-Government by Using GIS and Establishing E-Government in Developing Countries Case Study: Iraq
Authors: Ahmed M. Jamel
Abstract:
Electronic government initiatives and public participation to them are among the indicators of today's development criteria of the countries. After consequent two wars, Iraq's current position in, for example, UN's e-government ranking is quite concerning and did not improve in recent years, either. In the preparation of this work, we are motivated with the fact that handling geographic data of the public facilities and resources are needed in most of the e-government projects. Geographical information systems (GIS) provide most common tools not only to manage spatial data but also to integrate such type of data with nonspatial attributes of the features. With this background, this paper proposes that establishing a working GIS in the health sector of Iraq would improve e-government applications. As the case study, investigating hospital locations in Erbil is chosen.Keywords: e-government, GIS, Iraq, Erbil
Procedia PDF Downloads 38924389 Biological Control of Sclerotium rolfsii, Damping-off Disease on Centella asiatica
Authors: K. Sunitra, T. Srisuda
Abstract:
Centella asiatica, asiatic pennywort is a medicinal herb plant used widely which held in herbal health care group. The problem of asiatic pennywort production is the outbreak of Sclerotium rolfsii causing a damp-off disease which caused plant stem turn yellowish, finally they begin to die and result in extremely damaging to growers. Therefore, the studies were caried out to control damping off with Trichoderma sp., Bacillus subtilis and fermented banana as compared to the control to suppress with bi-culture under the laboratory condition. It was found that Trichoderma harzianum showed the highest percentage of inbihition, 69.44%. The pot experiments in greenhouse condition showed that chemical had minimum of damping-off (31.54%) and highest yield (1.20 tons/rai) and following by Trichoderma harzianum and Bacillus subtilis treatment. Due to the chemical usage leaving toxic residues on plants and affect the human bodies. Trichoderma harzianum and Bacillus subtilis should be considered as alternatives which have percent of damp-off disease and yields as follows: 45.50 and 43.75%, and 1.12 and 1.09 tons/rai, respectively. These two products are known that they have no health risk for growers and consumers in the future.Keywords: Centella asiatica, Sclerotium rolfsii, Trichoderma harzianum, Bacillus subtilis
Procedia PDF Downloads 30324388 Evaluation of Classification Algorithms for Diagnosis of Asthma in Iranian Patients
Authors: Taha SamadSoltani, Peyman Rezaei Hachesu, Marjan GhaziSaeedi, Maryam Zolnoori
Abstract:
Introduction: Data mining defined as a process to find patterns and relationships along data in the database to build predictive models. Application of data mining extended in vast sectors such as the healthcare services. Medical data mining aims to solve real-world problems in the diagnosis and treatment of diseases. This method applies various techniques and algorithms which have different accuracy and precision. The purpose of this study was to apply knowledge discovery and data mining techniques for the diagnosis of asthma based on patient symptoms and history. Method: Data mining includes several steps and decisions should be made by the user which starts by creation of an understanding of the scope and application of previous knowledge in this area and identifying KD process from the point of view of the stakeholders and finished by acting on discovered knowledge using knowledge conducting, integrating knowledge with other systems and knowledge documenting and reporting.in this study a stepwise methodology followed to achieve a logical outcome. Results: Sensitivity, Specifity and Accuracy of KNN, SVM, Naïve bayes, NN, Classification tree and CN2 algorithms and related similar studies was evaluated and ROC curves were plotted to show the performance of the system. Conclusion: The results show that we can accurately diagnose asthma, approximately ninety percent, based on the demographical and clinical data. The study also showed that the methods based on pattern discovery and data mining have a higher sensitivity compared to expert and knowledge-based systems. On the other hand, medical guidelines and evidence-based medicine should be base of diagnostics methods, therefore recommended to machine learning algorithms used in combination with knowledge-based algorithms.Keywords: asthma, datamining, classification, machine learning
Procedia PDF Downloads 44724387 Application of GPRS in Water Quality Monitoring System
Authors: V. Ayishwarya Bharathi, S. M. Hasker, J. Indhu, M. Mohamed Azarudeen, G. Gowthami, R. Vinoth Rajan, N. Vijayarangan
Abstract:
Identification of water quality conditions in a river system based on limited observations is an essential task for meeting the goals of environmental management. The traditional method of water quality testing is to collect samples manually and then send to laboratory for analysis. However, it has been unable to meet the demands of water quality monitoring today. So a set of automatic measurement and reporting system of water quality has been developed. In this project specifies Water quality parameters collected by multi-parameter water quality probe are transmitted to data processing and monitoring center through GPRS wireless communication network of mobile. The multi parameter sensor is directly placed above the water level. The monitoring center consists of GPRS and micro-controller which monitor the data. The collected data can be monitor at any instant of time. In the pollution control board they will monitor the water quality sensor data in computer using Visual Basic Software. The system collects, transmits and processes water quality parameters automatically, so production efficiency and economy benefit are improved greatly. GPRS technology can achieve well within the complex environment of poor water quality non-monitored, and more specifically applicable to the collection point, data transmission automatically generate the field of water analysis equipment data transmission and monitoring.Keywords: multiparameter sensor, GPRS, visual basic software, RS232
Procedia PDF Downloads 41224386 Decision Support System in Air Pollution Using Data Mining
Authors: E. Fathallahi Aghdam, V. Hosseini
Abstract:
Environmental pollution is not limited to a specific region or country; that is why sustainable development, as a necessary process for improvement, pays attention to issues such as destruction of natural resources, degradation of biological system, global pollution, and climate change in the world, especially in the developing countries. According to the World Health Organization, as a developing city, Tehran (capital of Iran) is one of the most polluted cities in the world in terms of air pollution. In this study, three pollutants including particulate matter less than 10 microns, nitrogen oxides, and sulfur dioxide were evaluated in Tehran using data mining techniques and through Crisp approach. The data from 21 air pollution measuring stations in different areas of Tehran were collected from 1999 to 2013. Commercial softwares Clementine was selected for this study. Tehran was divided into distinct clusters in terms of the mentioned pollutants using the software. As a data mining technique, clustering is usually used as a prologue for other analyses, therefore, the similarity of clusters was evaluated in this study through analyzing local conditions, traffic behavior, and industrial activities. In fact, the results of this research can support decision-making system, help managers improve the performance and decision making, and assist in urban studies.Keywords: data mining, clustering, air pollution, crisp approach
Procedia PDF Downloads 42724385 Muddle Effort for Organized Crime in India: Social Work Concern for Anti Human Trafficking Unit
Authors: Rajkamal Ajmeri, Leena Mehta
Abstract:
Growing magnitude of human trafficking is the indicatory symptom of ill society. Despite of many treaties, legislation and protocols control over human trafficking require additional attention. However, many Anti Human Trafficking Units (AHTU) are working throughout India but it is a fact that incidence pertaining to illegal human trade is not fully under control. Social work as discipline and practice base profession has a lot of concern about situation and the trafficked victims. United state put Indian in tier II watch list because they are not fully complying with the minimum standard of Trafficking Victims Protection laws but they are making a significant effort to bring themselves into compliance with those standards. In order to solve the issue, scientific research of experiences and opinions of government / non government machineries can play an effective role in raising the standard legislation for trafficked victims. Proper study can enhance understanding on various problems faced by government machineries. The study can help in developing the scientific model, which can effectively solve the problem in human trafficking field.Keywords: human trafficking, legislations, victims, social work, government machinery
Procedia PDF Downloads 29824384 Application of Forensic Entomology to Estimate the Post Mortem Interval
Authors: Meriem Taleb, Ghania Tail, Fatma Zohra Kara, Brahim Djedouani, T. Moussa
Abstract:
Forensic entomology has grown immensely as a discipline in the past thirty years. The main purpose of forensic entomology is to establish the post mortem interval or PMI. Three days after the death, insect evidence is often the most accurate and sometimes the only method of determining elapsed time since death. This work presents the estimation of the PMI in an experiment to test the reliability of the accumulated degree days (ADD) method and the application of this method in a real case. The study was conducted at the Laboratory of Entomology at the National Institute for Criminalistics and Criminology of the National Gendarmerie, Algeria. The domestic rabbit Oryctolagus cuniculus L. was selected as the animal model. On 08th July 2012, the animal was killed. Larvae were collected and raised to adulthood. Estimation of oviposition time was calculated by summing up average daily temperatures minus minimum development temperature (also specific to each species). When the sum is reached, it corresponds to the oviposition day. Weather data were obtained from the nearest meteorological station. After rearing was accomplished, three species emerged: Lucilia sericata, Chrysomya albiceps, and Sarcophaga africa. For Chrysomya albiceps species, a cumulation of 186°C is necessary. The emergence of adults occured on 22nd July 2012. A value of 193.4°C is reached on 9th August 2012. Lucilia sericata species require a cumulation of 207°C. The emergence of adults occurred on 23rd, July 2012. A value of 211.35°C is reached on 9th August 2012. We should also consider that oviposition may occur more than 12 hours after death. Thus, the obtained PMI is in agreement with the actual time of death. We illustrate the use of this method during the investigation of a case of a decaying human body found on 03rd March 2015 in Bechar, South West of Algerian desert. Maggots were collected and sent to the Laboratory of Entomology. Lucilia sericata adults were identified on 24th March 2015 after emergence. A sum of 211.6°C was reached on 1st March 2015 which corresponds to the estimated day of oviposition. Therefore, the estimated date of death is 1st March 2015 ± 24 hours. The estimated PMI by accumulated degree days (ADD) method seems to be very precise. Entomological evidence should always be used in homicide investigations when the time of death cannot be determined by other methods.Keywords: forensic entomology, accumulated degree days, postmortem interval, diptera, Algeria
Procedia PDF Downloads 29424383 Mass Production of Endemic Diatoms in Polk County, Florida Concomitant with Biofuel Extraction
Authors: Melba D. Horton
Abstract:
Algae are identified as an alternative source of biofuel because of their ubiquitous distribution in aquatic environments. Diatoms are unique forms of algae characterized by silicified cell walls which have gained prominence in various technological applications. Polk County is home to a multitude of ponds and lakes but has not been explored for the presence of diatoms. Considering the condition of the waters brought about by predominant phosphate mining activities in the area, this research was conducted to determine if endemic diatoms are present and explore their potential for low-cost mass production. Using custom-built photobioreactors, water samples from various lakes provided by the Polk County Parks and Recreation and from nearby ponds were used as the source of diatoms together with other algae obtained during collection. Results of the initial culture cycles were successful, but later an overgrowth of other algae crashed the diatom population. Experiments were conducted in the laboratory to tease out some factors possibly contributing to the die-off. Generally, the total biomass declines after two culture cycles and the causative factors need further investigation. The lipid yield is minimum; however, the high frustule production after die-off adds value to the overall benefit of the harvest.Keywords: diatoms, algae, biofuel, lipid, photobioreactor, frustule
Procedia PDF Downloads 18824382 Modified InVEST for Whatsapp Messages Forensic Triage and Search through Visualization
Authors: Agria Rhamdhan
Abstract:
WhatsApp as the most popular mobile messaging app has been used as evidence in many criminal cases. As the use of mobile messages generates large amounts of data, forensic investigation faces the challenge of large data problems. The hardest part of finding this important evidence is because current practice utilizes tools and technique that require manual analysis to check all messages. That way, analyze large sets of mobile messaging data will take a lot of time and effort. Our work offers methodologies based on forensic triage to reduce large data to manageable sets resulting easier to do detailed reviews, then show the results through interactive visualization to show important term, entities and relationship through intelligent ranking using Term Frequency-Inverse Document Frequency (TF-IDF) and Latent Dirichlet Allocation (LDA) Model. By implementing this methodology, investigators can improve investigation processing time and result's accuracy.Keywords: forensics, triage, visualization, WhatsApp
Procedia PDF Downloads 16824381 Low Cost Webcam Camera and GNSS Integration for Updating Home Data Using AI Principles
Authors: Mohkammad Nur Cahyadi, Hepi Hapsari Handayani, Agus Budi Raharjo, Ronny Mardianto, Daud Wahyu Imani, Arizal Bawazir, Luki Adi Triawan
Abstract:
PDAM (local water company) determines customer charges by considering the customer's building or house. Charges determination significantly affects PDAM income and customer costs because the PDAM applies a subsidy policy for customers classified as small households. Periodic updates are needed so that pricing is in line with the target. A thorough customer survey in Surabaya is needed to update customer building data. However, the survey that has been carried out so far has been by deploying officers to conduct one-by-one surveys for each PDAM customer. Surveys with this method require a lot of effort and cost. For this reason, this research offers a technology called moblie mapping, a mapping method that is more efficient in terms of time and cost. The use of this tool is also quite simple, where the device will be installed in the car so that it can record the surrounding buildings while the car is running. Mobile mapping technology generally uses lidar sensors equipped with GNSS, but this technology requires high costs. In overcoming this problem, this research develops low-cost mobile mapping technology using a webcam camera sensor added to the GNSS and IMU sensors. The camera used has specifications of 3MP with a resolution of 720 and a diagonal field of view of 78⁰. The principle of this invention is to integrate four camera sensors, a GNSS webcam, and GPS to acquire photo data, which is equipped with location data (latitude, longitude) and IMU (roll, pitch, yaw). This device is also equipped with a tripod and a vacuum cleaner to attach to the car's roof so it doesn't fall off while running. The output data from this technology will be analyzed with artificial intelligence to reduce similar data (Cosine Similarity) and then classify building types. Data reduction is used to eliminate similar data and maintain the image that displays the complete house so that it can be processed for later classification of buildings. The AI method used is transfer learning by utilizing a trained model named VGG-16. From the analysis of similarity data, it was found that the data reduction reached 50%. Then georeferencing is done using the Google Maps API to get address information according to the coordinates in the data. After that, geographic join is done to link survey data with customer data already owned by PDAM Surya Sembada Surabaya.Keywords: mobile mapping, GNSS, IMU, similarity, classification
Procedia PDF Downloads 8424380 An Investigation into the Views of Distant Science Education Students Regarding Teaching Laboratory Work Online
Authors: Abraham Motlhabane
Abstract:
This research analysed the written views of science education students regarding the teaching of laboratory work using the online mode. The research adopted the qualitative methodology. The qualitative research was aimed at investigating small and distinct groups normally regarded as a single-site study. Qualitative research was used to describe and analyze the phenomena from the student’s perspective. This means the research began with assumptions of the world view that use theoretical lenses of research problems inquiring into the meaning of individual students. The research was conducted with three groups of students studying for Postgraduate Certificate in Education, Bachelor of Education and honors Bachelor of Education respectively. In each of the study programmes, the science education module is compulsory. Five science education students from each study programme were purposively selected to participate in this research. Therefore, 15 students participated in the research. In order to analysis the data, the data were first printed and hard copies were used in the analysis. The data was read several times and key concepts and ideas were highlighted. Themes and patterns were identified to describe the data. Coding as a process of organising and sorting data was used. The findings of the study are very diverse; some students are in favour of online laboratory whereas other students argue that science can only be learnt through hands-on experimentation.Keywords: online learning, laboratory work, views, perceptions
Procedia PDF Downloads 14524379 An Improved Adaptive Dot-Shape Beamforming Algorithm Research on Frequency Diverse Array
Authors: Yanping Liao, Zenan Wu, Ruigang Zhao
Abstract:
Frequency diverse array (FDA) beamforming is a technology developed in recent years, and its antenna pattern has a unique angle-distance-dependent characteristic. However, the beam is always required to have strong concentration, high resolution and low sidelobe level to form the point-to-point interference in the concentrated set. In order to eliminate the angle-distance coupling of the traditional FDA and to make the beam energy more concentrated, this paper adopts a multi-carrier FDA structure based on proposed power exponential frequency offset to improve the array structure and frequency offset of the traditional FDA. The simulation results show that the beam pattern of the array can form a dot-shape beam with more concentrated energy, and its resolution and sidelobe level performance are improved. However, the covariance matrix of the signal in the traditional adaptive beamforming algorithm is estimated by the finite-time snapshot data. When the number of snapshots is limited, the algorithm has an underestimation problem, which leads to the estimation error of the covariance matrix to cause beam distortion, so that the output pattern cannot form a dot-shape beam. And it also has main lobe deviation and high sidelobe level problems in the case of limited snapshot. Aiming at these problems, an adaptive beamforming technique based on exponential correction for multi-carrier FDA is proposed to improve beamforming robustness. The steps are as follows: first, the beamforming of the multi-carrier FDA is formed under linear constrained minimum variance (LCMV) criteria. Then the eigenvalue decomposition of the covariance matrix is performed to obtain the diagonal matrix composed of the interference subspace, the noise subspace and the corresponding eigenvalues. Finally, the correction index is introduced to exponentially correct the small eigenvalues of the noise subspace, improve the divergence of small eigenvalues in the noise subspace, and improve the performance of beamforming. The theoretical analysis and simulation results show that the proposed algorithm can make the multi-carrier FDA form a dot-shape beam at limited snapshots, reduce the sidelobe level, improve the robustness of beamforming, and have better performance.Keywords: adaptive beamforming, correction index, limited snapshot, multi-carrier frequency diverse array, robust
Procedia PDF Downloads 13024378 Inner and Outer School Contextual Factors Associated with Poor Performance of Grade 12 Students: A Case Study of an Underperforming High School in Mpumalanga, South Africa
Authors: Victoria L. Nkosi, Parvaneh Farhangpour
Abstract:
Often a Grade 12 certificate is perceived as a passport to tertiary education and the minimum requirement to enter the world of work. In spite of its importance, many students do not make this milestone in South Africa. It is important to find out why so many students still fail in spite of transformation in the education system in the post-apartheid era. Given the complexity of education and its context, this study adopted a case study design to examine one historically underperforming high school in Bushbuckridge, Mpumalanga Province, South Africa in 2013. The aim was to gain a understanding of the inner and outer school contextual factors associated with the high failure rate among Grade 12 students. Government documents and reports were consulted to identify factors in the district and the village surrounding the school and a student survey was conducted to identify school, home and student factors. The randomly-sampled half of the population of Grade 12 students (53) participated in the survey and quantitative data are analyzed using descriptive statistical methods. The findings showed that a host of factors is at play. The school is located in a village within a municipality which has been one of the poorest three municipalities in South Africa and the lowest Grade 12 pass rate in the Mpumalanga province. Moreover, over half of the families of the students are single parents, 43% are unemployed and the majority has a low level of education. In addition, most families (83%) do not have basic study materials such as a dictionary, books, tables, and chairs. A significant number of students (70%) are over-aged (+19 years old); close to half of them (49%) are grade repeaters. The school itself lacks essential resources, namely computers, science laboratories, library, and enough furniture and textbooks. Moreover, teaching and learning are negatively affected by the teachers’ occasional absenteeism, inadequate lesson preparation, and poor communication skills. Overall, the continuous low performance of students in this school mirrors the vicious circle of multiple negative conditions present within and outside of the school. The complexity of factors associated with the underperformance of Grade 12 students in this school calls for a multi-dimensional intervention from government and stakeholders. One important intervention should be the placement of over-aged students and grade-repeaters in suitable educational institutions for the benefit of other students.Keywords: inner context, outer context, over-aged students, vicious cycle
Procedia PDF Downloads 20124377 The Communication Library DIALOG for iFDAQ of the COMPASS Experiment
Authors: Y. Bai, M. Bodlak, V. Frolov, S. Huber, V. Jary, I. Konorov, D. Levit, J. Novy, D. Steffen, O. Subrt, M. Virius
Abstract:
Modern experiments in high energy physics impose great demands on the reliability, the efficiency, and the data rate of Data Acquisition Systems (DAQ). This contribution focuses on the development and deployment of the new communication library DIALOG for the intelligent, FPGA-based Data Acquisition System (iFDAQ) of the COMPASS experiment at CERN. The iFDAQ utilizing a hardware event builder is designed to be able to readout data at the maximum rate of the experiment. The DIALOG library is a communication system both for distributed and mixed environments, it provides a network transparent inter-process communication layer. Using the high-performance and modern C++ framework Qt and its Qt Network API, the DIALOG library presents an alternative to the previously used DIM library. The DIALOG library was fully incorporated to all processes in the iFDAQ during the run 2016. From the software point of view, it might be considered as a significant improvement of iFDAQ in comparison with the previous run. To extend the possibilities of debugging, the online monitoring of communication among processes via DIALOG GUI is a desirable feature. In the paper, we present the DIALOG library from several insights and discuss it in a detailed way. Moreover, the efficiency measurement and comparison with the DIM library with respect to the iFDAQ requirements is provided.Keywords: data acquisition system, DIALOG library, DIM library, FPGA, Qt framework, TCP/IP
Procedia PDF Downloads 31624376 Mining Scientific Literature to Discover Potential Research Data Sources: An Exploratory Study in the Field of Haemato-Oncology
Authors: A. Anastasiou, K. S. Tingay
Abstract:
Background: Discovering suitable datasets is an important part of health research, particularly for projects working with clinical data from patients organized in cohorts (cohort data), but with the proliferation of so many national and international initiatives, it is becoming increasingly difficult for research teams to locate real world datasets that are most relevant to their project objectives. We present a method for identifying healthcare institutes in the European Union (EU) which may hold haemato-oncology (HO) data. A key enabler of this research was the bibInsight platform, a scientometric data management and analysis system developed by the authors at Swansea University. Method: A PubMed search was conducted using HO clinical terms taken from previous work. The resulting XML file was processed using the bibInsight platform, linking affiliations to the Global Research Identifier Database (GRID). GRID is an international, standardized list of institutions, including the city and country in which the institution exists, as well as a category of the main business type, e.g., Academic, Healthcare, Government, Company. Countries were limited to the 28 current EU members, and institute type to 'Healthcare'. An article was considered valid if at least one author was affiliated with an EU-based healthcare institute. Results: The PubMed search produced 21,310 articles, consisting of 9,885 distinct affiliations with correspondence in GRID. Of these articles, 760 were from EU countries, and 390 of these were healthcare institutes. One affiliation was excluded as being a veterinary hospital. Two EU countries did not have any publications in our analysis dataset. The results were analysed by country and by individual healthcare institute. Networks both within the EU and internationally show institutional collaborations, which may suggest a willingness to share data for research purposes. Geographical mapping can ensure that data has broad population coverage. Collaborations with industry or government may exclude healthcare institutes that may have embargos or additional costs associated with data access. Conclusions: Data reuse is becoming increasingly important both for ensuring the validity of results, and economy of available resources. The ability to identify potential, specific data sources from over twenty thousand articles in less than an hour could assist in improving knowledge of, and access to, data sources. As our method has not yet specified if these healthcare institutes are holding data, or merely publishing on that topic, future work will involve text mining of data-specific concordant terms to identify numbers of participants, demographics, study methodologies, and sub-topics of interest.Keywords: data reuse, data discovery, data linkage, journal articles, text mining
Procedia PDF Downloads 11524375 Using Data Mining Technique for Scholarship Disbursement
Authors: J. K. Alhassan, S. A. Lawal
Abstract:
This work is on decision tree-based classification for the disbursement of scholarship. Tree-based data mining classification technique is used in other to determine the generic rule to be used to disburse the scholarship. The system based on the defined rules from the tree is able to determine the class (status) to which an applicant shall belong whether Granted or Not Granted. The applicants that fall to the class of granted denote a successful acquirement of scholarship while those in not granted class are unsuccessful in the scheme. An algorithm that can be used to classify the applicants based on the rules from tree-based classification was also developed. The tree-based classification is adopted because of its efficiency, effectiveness, and easy to comprehend features. The system was tested with the data of National Information Technology Development Agency (NITDA) Abuja, a Parastatal of Federal Ministry of Communication Technology that is mandated to develop and regulate information technology in Nigeria. The system was found working according to the specification. It is therefore recommended for all scholarship disbursement organizations.Keywords: classification, data mining, decision tree, scholarship
Procedia PDF Downloads 37624374 [Keynote Speech]: Feature Selection and Predictive Modeling of Housing Data Using Random Forest
Authors: Bharatendra Rai
Abstract:
Predictive data analysis and modeling involving machine learning techniques become challenging in presence of too many explanatory variables or features. Presence of too many features in machine learning is known to not only cause algorithms to slow down, but they can also lead to decrease in model prediction accuracy. This study involves housing dataset with 79 quantitative and qualitative features that describe various aspects people consider while buying a new house. Boruta algorithm that supports feature selection using a wrapper approach build around random forest is used in this study. This feature selection process leads to 49 confirmed features which are then used for developing predictive random forest models. The study also explores five different data partitioning ratios and their impact on model accuracy are captured using coefficient of determination (r-square) and root mean square error (rsme).Keywords: housing data, feature selection, random forest, Boruta algorithm, root mean square error
Procedia PDF Downloads 32324373 Presenting the Mathematical Model to Determine Retention in the Watersheds
Authors: S. Shamohammadi, L. Razavi
Abstract:
This paper based on the principle concepts of SCS-CN model, a new mathematical model for computation of retention potential (S) presented. In the mathematical model, not only precipitation-runoff concepts in SCS-CN model are precisely represented in a mathematical form, but also new concepts, called “maximum retention” and “total retention” is introduced, and concepts of potential retention capacity, maximum retention, and total retention have been separated from each other. In the proposed model, actual retention (F), maximum actual retention (Fmax), total retention (S), maximum retention (Smax), and potential retention (Sp), for the first time clearly defined, so that Sp is not variable, but a function of morphological characteristics of the watershed. Indeed, based on the mathematical relation of the conceptual curve of SCS-CN model, the proposed model provides a new method for the computation of actual retention in watershed and it simply determined runoff based on. In the corresponding relations, in addition to Precipitation (P), Initial retention (Ia), cumulative values of actual retention capacity (F), total retention (S), runoff (Q), antecedent moisture (M), potential retention (Sp), total retention (S), we introduced Fmax and Fmin referring to maximum and minimum actual retention, respectively. As well as, ksh is a coefficient which depends on morphological characteristics of the watershed. Advantages of the modified version versus the original model include a better precision, higher performance, easier calibration and speed computing.Keywords: model, mathematical, retention, watershed, SCS
Procedia PDF Downloads 45724372 Image-Based (RBG) Technique for Estimating Phosphorus Levels of Different Crops
Authors: M. M. Ali, Ahmed Al- Ani, Derek Eamus, Daniel K. Y. Tan
Abstract:
In this glasshouse study, we developed the new image-based non-destructive technique for detecting leaf P status of different crops such as cotton, tomato and lettuce. Plants were allowed to grow on nutrient media containing different P concentrations, i.e. 0%, 50% and 100% of recommended P concentration (P0 = no P, L; P1 = 2.5 mL 10 L-1 of P and P2 = 5 mL 10 L-1 of P as NaH2PO4). After 10 weeks of growth, plants were harvested and data on leaf P contents were collected using the standard destructive laboratory method and at the same time leaf images were collected by a handheld crop image sensor. We calculated leaf area, leaf perimeter and RGB (red, green and blue) values of these images. This data was further used in the linear discriminant analysis (LDA) to estimate leaf P contents, which successfully classified these plants on the basis of leaf P contents. The data indicated that P deficiency in crop plants can be predicted using the image and morphological data. Our proposed non-destructive imaging method is precise in estimating P requirements of different crop species.Keywords: image-based techniques, leaf area, leaf P contents, linear discriminant analysis
Procedia PDF Downloads 38224371 Design of Visual Repository, Constraint and Process Modeling Tool Based on Eclipse Plug-Ins
Authors: Rushiraj Heshi, Smriti Bhandari
Abstract:
Master Data Management requires creation of Central repository, applying constraints on Repository and designing processes to manage data. Designing of Repository, constraints on repository and business processes is very tedious and time consuming task for large Enterprise. Hence Visual Repository, constraints and Process (Workflow) modeling is the most critical step in Master Data Management.In this paper, we realize a Visual Modeling tool for implementing Repositories, Constraints and Processes based on Eclipse Plugin using GMF/EMF which follows principles of Model Driven Engineering (MDE).Keywords: EMF, GMF, GEF, repository, constraint, process
Procedia PDF Downloads 49724370 The Classification Performance in Parametric and Nonparametric Discriminant Analysis for a Class- Unbalanced Data of Diabetes Risk Groups
Authors: Lily Ingsrisawang, Tasanee Nacharoen
Abstract:
Introduction: The problems of unbalanced data sets generally appear in real world applications. Due to unequal class distribution, many research papers found that the performance of existing classifier tends to be biased towards the majority class. The k -nearest neighbors’ nonparametric discriminant analysis is one method that was proposed for classifying unbalanced classes with good performance. Hence, the methods of discriminant analysis are of interest to us in investigating misclassification error rates for class-imbalanced data of three diabetes risk groups. Objective: The purpose of this study was to compare the classification performance between parametric discriminant analysis and nonparametric discriminant analysis in a three-class classification application of class-imbalanced data of diabetes risk groups. Methods: Data from a healthy project for 599 staffs in a government hospital in Bangkok were obtained for the classification problem. The staffs were diagnosed into one of three diabetes risk groups: non-risk (90%), risk (5%), and diabetic (5%). The original data along with the variables; diabetes risk group, age, gender, cholesterol, and BMI was analyzed and bootstrapped up to 50 and 100 samples, 599 observations per sample, for additional estimation of misclassification error rate. Each data set was explored for the departure of multivariate normality and the equality of covariance matrices of the three risk groups. Both the original data and the bootstrap samples show non-normality and unequal covariance matrices. The parametric linear discriminant function, quadratic discriminant function, and the nonparametric k-nearest neighbors’ discriminant function were performed over 50 and 100 bootstrap samples and applied to the original data. In finding the optimal classification rule, the choices of prior probabilities were set up for both equal proportions (0.33: 0.33: 0.33) and unequal proportions with three choices of (0.90:0.05:0.05), (0.80: 0.10: 0.10) or (0.70, 0.15, 0.15). Results: The results from 50 and 100 bootstrap samples indicated that the k-nearest neighbors approach when k = 3 or k = 4 and the prior probabilities of {non-risk:risk:diabetic} as {0.90:0.05:0.05} or {0.80:0.10:0.10} gave the smallest error rate of misclassification. Conclusion: The k-nearest neighbors approach would be suggested for classifying a three-class-imbalanced data of diabetes risk groups.Keywords: error rate, bootstrap, diabetes risk groups, k-nearest neighbors
Procedia PDF Downloads 435