Search results for: Data Mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 24815

Search results for: Data Mining

24035 Ontological Modeling Approach for Statistical Databases Publication in Linked Open Data

Authors: Bourama Mane, Ibrahima Fall, Mamadou Samba Camara, Alassane Bah

Abstract:

At the level of the National Statistical Institutes, there is a large volume of data which is generally in a format which conditions the method of publication of the information they contain. Each household or business data collection project includes a dissemination platform for its implementation. Thus, these dissemination methods previously used, do not promote rapid access to information and especially does not offer the option of being able to link data for in-depth processing. In this paper, we present an approach to modeling these data to publish them in a format intended for the Semantic Web. Our objective is to be able to publish all this data in a single platform and offer the option to link with other external data sources. An application of the approach will be made on data from major national surveys such as the one on employment, poverty, child labor and the general census of the population of Senegal.

Keywords: Semantic Web, linked open data, database, statistic

Procedia PDF Downloads 162
24034 Destination Management Organization in the Digital Era: A Data Framework to Leverage Collective Intelligence

Authors: Alfredo Fortunato, Carmelofrancesco Origlia, Sara Laurita, Rossella Nicoletti

Abstract:

In the post-pandemic recovery phase of tourism, the role of a Destination Management Organization (DMO) as a coordinated management system of all the elements that make up a destination (attractions, access, marketing, human resources, brand, pricing, etc.) is also becoming relevant for local territories. The objective of a DMO is to maximize the visitor's perception of value and quality while ensuring the competitiveness and sustainability of the destination, as well as the long-term preservation of its natural and cultural assets, and to catalyze benefits for the local economy and residents. In carrying out the multiple functions to which it is called, the DMO can leverage a collective intelligence that comes from the ability to pool information, explicit and tacit knowledge, and relationships of the various stakeholders: policymakers, public managers and officials, entrepreneurs in the tourism supply chain, researchers, data journalists, schools, associations and committees, citizens, etc. The DMO potentially has at its disposal large volumes of data and many of them at low cost, that need to be properly processed to produce value. Based on these assumptions, the paper presents a conceptual framework for building an information system to support the DMO in the intelligent management of a tourist destination tested in an area of southern Italy. The approach adopted is data-informed and consists of four phases: (1) formulation of the knowledge problem (analysis of policy documents and industry reports; focus groups and co-design with stakeholders; definition of information needs and key questions); (2) research and metadatation of relevant sources (reconnaissance of official sources, administrative archives and internal DMO sources); (3) gap analysis and identification of unconventional information sources (evaluation of traditional sources with respect to the level of consistency with information needs, the freshness of information and granularity of data; enrichment of the information base by identifying and studying web sources such as Wikipedia, Google Trends, Booking.com, Tripadvisor, websites of accommodation facilities and online newspapers); (4) definition of the set of indicators and construction of the information base (specific definition of indicators and procedures for data acquisition, transformation, and analysis). The framework derived consists of 6 thematic areas (accommodation supply, cultural heritage, flows, value, sustainability, and enabling factors), each of which is divided into three domains that gather a specific information need to be represented by a scheme of questions to be answered through the analysis of available indicators. The framework is characterized by a high degree of flexibility in the European context, given that it can be customized for each destination by adapting the part related to internal sources. Application to the case study led to the creation of a decision support system that allows: •integration of data from heterogeneous sources, including through the execution of automated web crawling procedures for data ingestion of social and web information; •reading and interpretation of data and metadata through guided navigation paths in the key of digital story-telling; •implementation of complex analysis capabilities through the use of data mining algorithms such as for the prediction of tourist flows.

Keywords: collective intelligence, data framework, destination management, smart tourism

Procedia PDF Downloads 105
24033 The Role of Data Protection Officer in Managing Individual Data: Issues and Challenges

Authors: Nazura Abdul Manap, Siti Nur Farah Atiqah Salleh

Abstract:

For decades, the misuse of personal data has been a critical issue. Malaysia has accepted responsibility by implementing the Malaysian Personal Data Protection Act 2010 to secure personal data (PDPA 2010). After more than a decade, this legislation is set to be revised by the current PDPA 2023 Amendment Bill to align with the world's key personal data protection regulations, such as the European Union General Data Protection Regulations (GDPR). Among the other suggested adjustments is the Data User's appointment of a Data Protection Officer (DPO) to ensure the commercial entity's compliance with the PDPA 2010 criteria. The change is expected to be enacted in parliament fairly soon; nevertheless, based on the experience of the Personal Data Protection Department (PDPD) in implementing the Act, it is projected that there will be a slew of additional concerns associated with the DPO mandate. Consequently, the goal of this article is to highlight the issues that the DPO will encounter and how the Personal Data Protection Department should respond to this subject. The study result was produced using a qualitative technique based on an examination of the current literature. This research reveals that there are probable obstacles experienced by the DPO, and thus, there should be a definite, clear guideline in place to aid DPO in executing their tasks. It is argued that appointing a DPO is a wise measure in ensuring that the legal data security requirements are met.

Keywords: guideline, law, data protection officer, personal data

Procedia PDF Downloads 62
24032 Benthic Foraminiferal Responses to Coastal Pollution for Some Selected Sites along Red Sea, Egypt

Authors: Ramadan M. El-Kahawy, M. A. El-Shafeiy, Mohamed Abd El-Wahab, S. A. Helal, Nabil Aboul-Ela

Abstract:

Due to the economic importance of Safaga Bay, Quseir harbor and Ras Gharib harbor , a multidisciplinary approach was adopted to invistigate 27 surfecial sediment samples from the three sites and 9 samples for each in order to use the benthic foraminifera as bio-indicators for characterization of the environmental variations. Grain size analyses indicate that the bottom facies in the inner part of quseir is muddy while the inner part of Ras Gharib and Safaga is silty sand and those close to the entrance of Safaga bay and Ras Gharib is sandy facies while quseir still also muddy facies. geochemical data show high concentration of heavy-metals mainly in Ras Gharib due to oil leakage from the hydrocarbon oil field and Safaga bay due to the phosphate mining while quseir is medium concentration due to anthropocentric effect.micropaelontological analyses indicate the boundaries of the highest concentration of heavy metals and those of low concentration as well.the dominant benthic foraminifera in these three sites are Ammonia beccarii, Amphistigina and sorites. the study highlights the worsening of environmental conditions and also show that the areas in need of a priority recovery.

Keywords: benthic foraminifera, Ras Gharib, Safaga, Quseir, Red Sea, Egypt

Procedia PDF Downloads 329
24031 Career Guidance System Using Machine Learning

Authors: Mane Darbinyan, Lusine Hayrapetyan, Elen Matevosyan

Abstract:

Artificial Intelligence in Education (AIED) has been created to help students get ready for the workforce, and over the past 25 years, it has grown significantly, offering a variety of technologies to support academic, institutional, and administrative services. However, this is still challenging, especially considering the labor market's rapid change. While choosing a career, people face various obstacles because they do not take into consideration their own preferences, which might lead to many other problems like shifting jobs, work stress, occupational infirmity, reduced productivity, and manual error. Besides preferences, people should properly evaluate their technical and non-technical skills, as well as their personalities. Professional counseling has become a difficult undertaking for counselors due to the wide range of career choices brought on by changing technological trends. It is necessary to close this gap by utilizing technology that makes sophisticated predictions about a person's career goals based on their personality. Hence, there is a need to create an automated model that would help in decision-making based on user inputs. Improving career guidance can be achieved by embedding machine learning into the career consulting ecosystem. There are various systems of career guidance that work based on the same logic, such as the classification of applicants, matching applications with appropriate departments or jobs, making predictions, and providing suitable recommendations. Methodologies like KNN, Neural Networks, K-means clustering, D-Tree, and many other advanced algorithms are applied in the fields of data and compute some data, which is helpful to predict the right careers. Besides helping users with their career choice, these systems provide numerous opportunities which are very useful while making this hard decision. They help the candidate to recognize where he/she specifically lacks sufficient skills so that the candidate can improve those skills. They are also capable to offer an e-learning platform, taking into account the user's lack of knowledge. Furthermore, users can be provided with details on a particular job, such as the abilities required to excel in that industry.

Keywords: career guidance system, machine learning, career prediction, predictive decision, data mining, technical and non-technical skills

Procedia PDF Downloads 66
24030 Data Collection Based on the Questionnaire Survey In-Hospital Emergencies

Authors: Nouha Mhimdi, Wahiba Ben Abdessalem Karaa, Henda Ben Ghezala

Abstract:

The methods identified in data collection are diverse: electronic media, focus group interviews and short-answer questionnaires [1]. The collection of poor-quality data resulting, for example, from poorly designed questionnaires, the absence of good translators or interpreters, and the incorrect recording of data allow conclusions to be drawn that are not supported by the data or to focus only on the average effect of the program or policy. There are several solutions to avoid or minimize the most frequent errors, including obtaining expert advice on the design or adaptation of data collection instruments; or use technologies allowing better "anonymity" in the responses [2]. In this context, we opted to collect good quality data by doing a sizeable questionnaire-based survey on hospital emergencies to improve emergency services and alleviate the problems encountered. At the level of this paper, we will present our study, and we will detail the steps followed to achieve the collection of relevant, consistent and practical data.

Keywords: data collection, survey, questionnaire, database, data analysis, hospital emergencies

Procedia PDF Downloads 90
24029 Power Asymmetry and Major Corporate Social Responsibility Projects in Mhondoro-Ngezi District, Zimbabwe

Authors: A. T. Muruviwa

Abstract:

Empirical studies of the current CSR agenda have been dominated by literature from the North at the expense of the nations from the South where most TNCs are located. Therefore, owing to the limitations of the current discourse that is dominated by Western ideas such as voluntarism, philanthropy, business case and economic gains, scholars have been calling for a new CSR agenda that is South-centred and addresses the needs of developing nations. The development theme has dominated in the recent literature as scholars concerned with the relationship between business and society have tried to understand its relationship with CSR. Despite a plethora of literature on the roles of corporations in local communities and the impact of CSR initiatives, there is lack of adequate empirical evidence to help us understand the nexus between CSR and development. For all the claims made about the positive and negative consequences of CSR, there is surprisingly little information about the outcomes it delivers. This study is a response to these claims made about the developmental aspect of CSR in developing countries. It offers some empirical bases for assessing the major CSR projects that have been fulfilled by a major mining company, Zimplats in Mhondoro-Ngezi Zimbabwe. The neo-liberal idea of capitalism and market dominations has empowered TNCs to stamp their authority in the developing countries. TNCs have made their mark in developing nations as they stamp their global private authority, rivalling or implicitly challenging the state in many functions. This dominance of corporate power raises great concerns over their tendencies of abuses in terms of environmental, social and human rights concerns as well as how to make them increasingly accountable. The hegemonic power of TNCs in the developing countries has had a tremendous impact on the overall CSR practices. While TNCs are key drivers of globalization they may be acting responsibly in their Global Northern home countries where there is a combination of legal mechanisms and the fear of civil society activism associated with corporate scandals. Using a triangulated approach in which both qualitative and quantitative methods were used the study found out that most CSR projects in Zimbabwe are dominated and directed by Zimplats because of the power it possesses. Most of the major CSR projects are beneficial to the mining company as they serve the business plans of the mining company. What was deduced from the study is that the infrastructural development initiatives by Zimplats confirm that CSR is a tool to advance business obligations. This shows that although proponents of CSR might claim that business has a mandate for social obligations to society, we need not to forget the dominant idea that the primary function of CSR is to enhance the firm’s profitability.

Keywords: hegemonic power, projects, reciprocity, stakeholders

Procedia PDF Downloads 234
24028 Career Guidance System Using Machine Learning

Authors: Mane Darbinyan, Lusine Hayrapetyan, Elen Matevosyan

Abstract:

Artificial Intelligence in Education (AIED) has been created to help students get ready for the workforce, and over the past 25 years, it has grown significantly, offering a variety of technologies to support academic, institutional, and administrative services. However, this is still challenging, especially considering the labor market's rapid change. While choosing a career, people face various obstacles because they do not take into consideration their own preferences, which might lead to many other problems like shifting jobs, work stress, occupational infirmity, reduced productivity, and manual error. Besides preferences, people should evaluate properly their technical and non-technical skills, as well as their personalities. Professional counseling has become a difficult undertaking for counselors due to the wide range of career choices brought on by changing technological trends. It is necessary to close this gap by utilizing technology that makes sophisticated predictions about a person's career goals based on their personality. Hence, there is a need to create an automated model that would help in decision-making based on user inputs. Improving career guidance can be achieved by embedding machine learning into the career consulting ecosystem. There are various systems of career guidance that work based on the same logic, such as the classification of applicants, matching applications with appropriate departments or jobs, making predictions, and providing suitable recommendations. Methodologies like KNN, neural networks, K-means clustering, D-Tree, and many other advanced algorithms are applied in the fields of data and compute some data, which is helpful to predict the right careers. Besides helping users with their career choice, these systems provide numerous opportunities which are very useful while making this hard decision. They help the candidate to recognize where he/she specifically lacks sufficient skills so that the candidate can improve those skills. They are also capable of offering an e-learning platform, taking into account the user's lack of knowledge. Furthermore, users can be provided with details on a particular job, such as the abilities required to excel in that industry.

Keywords: career guidance system, machine learning, career prediction, predictive decision, data mining, technical and non-technical skills

Procedia PDF Downloads 53
24027 Federated Learning in Healthcare

Authors: Ananya Gangavarapu

Abstract:

Convolutional Neural Networks (CNN) based models are providing diagnostic capabilities on par with the medical specialists in many specialty areas. However, collecting the medical data for training purposes is very challenging because of the increased regulations around data collections and privacy concerns around personal health data. The gathering of the data becomes even more difficult if the capture devices are edge-based mobile devices (like smartphones) with feeble wireless connectivity in rural/remote areas. In this paper, I would like to highlight Federated Learning approach to mitigate data privacy and security issues.

Keywords: deep learning in healthcare, data privacy, federated learning, training in distributed environment

Procedia PDF Downloads 122
24026 Allele Mining for Rice Sheath Blight Resistance by Whole-Genome Association Mapping in a Tail-End Population

Authors: Naoki Yamamoto, Hidenobu Ozaki, Taiichiro Ookawa, Youming Liu, Kazunori Okada, Aiping Zheng

Abstract:

Rice sheath blight is one of the destructive fungal diseases in rice. We have thought that rice sheath blight resistance is a polygenic trait. Host-pathogen interactions and secondary metabolites such as lignin and phytoalexins are likely to be involved in defense against R. solani. However, to our knowledge, it is still unknown how sheath blight resistance can be enhanced in rice breeding. To seek for an alternative genetic factor that contribute to sheath blight resistance, we mined relevant allelic variations from rice core collections created in Japan. Based on disease lesion length on detached leaf sheath, we selected 30 varieties of the top tail-end and the bottom tail-end, respectively, from the core collections to perform genome-wide association mapping. Re-sequencing reads for these varieties were used for calling single nucleotide polymorphisms among the 60 varieties to create a SNP panel, which contained 1,137,131 homozygous variant sites after filitering. Association mapping highlighted a locus on the long arm of chromosome 11, which is co-localized with three sheath blight QTLs, qShB11-2-TX, qShB11, and qSBR-11-2. Based on the localization of the trait-associated alleles, we identified an ankyryn repeat-containing protein gene (ANK-M) as an uncharacterized candidate factor for rice sheath blight resistance. Allelic distributions for ANK-M in the whole rice population supported the reliability of trait-allele associations. Gene expression characteristics were checked to evaluiate the functionality of ANK-M. Since an ANK-M homolog (OsPIANK1) in rice seems a basal defense regulator against rice blast and bacterial leaf blight, ANK-M may also play a role in the rice immune system.

Keywords: allele mining, GWAS, QTL, rice sheath blight

Procedia PDF Downloads 59
24025 The Utilization of Big Data in Knowledge Management Creation

Authors: Daniel Brian Thompson, Subarmaniam Kannan

Abstract:

The huge weightage of knowledge in this world and within the repository of organizations has already reached immense capacity and is constantly increasing as time goes by. To accommodate these constraints, Big Data implementation and algorithms are utilized to obtain new or enhanced knowledge for decision-making. With the transition from data to knowledge provides the transformational changes which will provide tangible benefits to the individual implementing these practices. Today, various organization would derive knowledge from observations and intuitions where this information or data will be translated into best practices for knowledge acquisition, generation and sharing. Through the widespread usage of Big Data, the main intention is to provide information that has been cleaned and analyzed to nurture tangible insights for an organization to apply to their knowledge-creation practices based on facts and figures. The translation of data into knowledge will generate value for an organization to make decisive decisions to proceed with the transition of best practices. Without a strong foundation of knowledge and Big Data, businesses are not able to grow and be enhanced within the competitive environment.

Keywords: big data, knowledge management, data driven, knowledge creation

Procedia PDF Downloads 93
24024 Survey on Data Security Issues Through Cloud Computing Amongst Sme’s in Nairobi County, Kenya

Authors: Masese Chuma Benard, Martin Onsiro Ronald

Abstract:

Businesses have been using cloud computing more frequently recently because they wish to take advantage of its advantages. However, employing cloud computing also introduces new security concerns, particularly with regard to data security, potential risks and weaknesses that could be exploited by attackers, and various tactics and strategies that could be used to lessen these risks. This study examines data security issues on cloud computing amongst sme’s in Nairobi county, Kenya. The study used the sample size of 48, the research approach was mixed methods, The findings show that data owner has no control over the cloud merchant's data management procedures, there is no way to ensure that data is handled legally. This implies that you will lose control over the data stored in the cloud. Data and information stored in the cloud may face a range of availability issues due to internet outages; this can represent a significant risk to data kept in shared clouds. Integrity, availability, and secrecy are all mentioned.

Keywords: data security, cloud computing, information, information security, small and medium-sized firms (SMEs)

Procedia PDF Downloads 65
24023 Cloud Design for Storing Large Amount of Data

Authors: M. Strémy, P. Závacký, P. Cuninka, M. Juhás

Abstract:

Main goal of this paper is to introduce our design of private cloud for storing large amount of data, especially pictures, and to provide good technological backend for data analysis based on parallel processing and business intelligence. We have tested hypervisors, cloud management tools, storage for storing all data and Hadoop to provide data analysis on unstructured data. Providing high availability, virtual network management, logical separation of projects and also rapid deployment of physical servers to our environment was also needed.

Keywords: cloud, glusterfs, hadoop, juju, kvm, maas, openstack, virtualization

Procedia PDF Downloads 339
24022 Groundwater Treatment of Thailand's Mae Moh Lignite Mine

Authors: A. Laksanayothin, W. Ariyawong

Abstract:

Mae Moh Lignite Mine is the largest open-pit mine in Thailand. The mine serves coal to the power plant about 16 million tons per year. This amount of coal can produce electricity accounting for about 10% of Nation’s electric power generation. The mining area of Mae Moh Mine is about 28 km2. At present, the deepest area of the pit is about 280 m from ground level (+40 m. MSL) and in the future the depth of the pit can reach 520 m from ground level (-200 m.MSL). As the size of the pit is quite large, the stability of the pit is seriously important. Furthermore, the preliminary drilling and extended drilling in year 1989-1996 had found high pressure aquifer under the pit. As a result, the pressure of the underground water has to be released in order to control mine pit stability. The study by the consulting experts later found that 3-5 million m3 per year of the underground water is needed to be de-watered for the safety of mining. However, the quality of this discharged water should meet the standard. Therefore, the ground water treatment facility has been implemented, aiming to reduce the amount of naturally contaminated Arsenic (As) in discharged water lower than the standard limit of 10 ppb. The treatment system consists of coagulation and filtration process. The main components include rapid mixing tanks, slow mixing tanks, sedimentation tank, thickener tank and sludge drying bed. The treatment process uses 40% FeCl3 as a coagulant. The FeCl3 will adsorb with As(V), forming floc particles and separating from the water as precipitate. After that, the sludge is dried in the sand bed and then be disposed in the secured land fill. Since 2011, the treatment plant of 12,000 m3/day has been efficiently operated. The average removal efficiency of the process is about 95%.

Keywords: arsenic, coagulant, ferric chloride, groundwater, lignite, coal mine

Procedia PDF Downloads 291
24021 Estimation of Missing Values in Aggregate Level Spatial Data

Authors: Amitha Puranik, V. S. Binu, Seena Biju

Abstract:

Missing data is a common problem in spatial analysis especially at the aggregate level. Missing can either occur in covariate or in response variable or in both in a given location. Many missing data techniques are available to estimate the missing data values but not all of these methods can be applied on spatial data since the data are autocorrelated. Hence there is a need to develop a method that estimates the missing values in both response variable and covariates in spatial data by taking account of the spatial autocorrelation. The present study aims to develop a model to estimate the missing data points at the aggregate level in spatial data by accounting for (a) Spatial autocorrelation of the response variable (b) Spatial autocorrelation of covariates and (c) Correlation between covariates and the response variable. Estimating the missing values of spatial data requires a model that explicitly account for the spatial autocorrelation. The proposed model not only accounts for spatial autocorrelation but also utilizes the correlation that exists between covariates, within covariates and between a response variable and covariates. The precise estimation of the missing data points in spatial data will result in an increased precision of the estimated effects of independent variables on the response variable in spatial regression analysis.

Keywords: spatial regression, missing data estimation, spatial autocorrelation, simulation analysis

Procedia PDF Downloads 361
24020 The Fundamental Research and Industrial Application on CO₂+O₂ in-situ Leaching Process in China

Authors: Lixin Zhao, Genmao Zhou

Abstract:

Traditional acid in-situ leaching (ISL) is not suitable for the sandstone uranium deposit with low permeability and high content of carbonate minerals, because of the blocking of calcium sulfate precipitates. Another factor influences the uranium acid in-situ leaching is that the pyrite in ore rocks will react with oxidation reagent and produce lots of sulfate ions which may speed up the precipitation process of calcium sulphate and consume lots of oxidation reagent. Due to the advantages such as less chemical reagent consumption and groundwater pollution, CO₂+O₂ in-situ leaching method has become one of the important research areas in uranium mining. China is the second country where CO₂+O₂ ISL has been adopted in industrial uranium production of the world. It is shown that the CO₂+O₂ ISL in China has been successfully developed. The reaction principle, technical process, well field design and drilling engineering, uranium-bearing solution processing, etc. have been fully studied. At current stage, several uranium mines use CO₂+O₂ ISL method to extract uranium from the ore-bearing aquifers. The industrial application and development potential of CO₂+O₂ ISL method in China are summarized. By using CO₂+O₂ neutral leaching technology, the problem of calcium carbonate and calcium sulfate precipitation have been solved during uranium mining. By reasonably regulating the amount of CO₂ and O₂, related ions and hydro-chemical conditions can be controlled within the limited extent for avoiding the occurrence of calcium sulfate and calcium carbonate precipitation. Based on this premise, the demand of CO₂+O₂ uranium leaching has been met to the maximum extent, which not only realizes the effective leaching of uranium, but also avoids the occurrence and precipitation of calcium carbonate and calcium sulfate, realizing the industrial development of the sandstone type uranium deposit.

Keywords: CO₂+O₂ ISL, industrial production, well field layout, uranium processing

Procedia PDF Downloads 150
24019 Immunization-Data-Quality in Public Health Facilities in the Pastoralist Communities: A Comparative Study Evidence from Afar and Somali Regional States, Ethiopia

Authors: Melaku Tsehay

Abstract:

The Consortium of Christian Relief and Development Associations (CCRDA), and the CORE Group Polio Partners (CGPP) Secretariat have been working with Global Alliance for Vac-cines and Immunization (GAVI) to improve the immunization data quality in Afar and Somali Regional States. The main aim of this study was to compare the quality of immunization data before and after the above interventions in health facilities in the pastoralist communities in Ethiopia. To this end, a comparative-cross-sectional study was conducted on 51 health facilities. The baseline data was collected in May 2019, while the end line data in August 2021. The WHO data quality self-assessment tool (DQS) was used to collect data. A significant improvment was seen in the accuracy of the pentavalent vaccine (PT)1 (p = 0.012) data at the health posts (HP), while PT3 (p = 0.010), and Measles (p = 0.020) at the health centers (HC). Besides, a highly sig-nificant improvment was observed in the accuracy of tetanus toxoid (TT)2 data at HP (p < 0.001). The level of over- or under-reporting was found to be < 8%, at the HP, and < 10% at the HC for PT3. The data completeness was also increased from 72.09% to 88.89% at the HC. Nearly 74% of the health facilities timely reported their respective immunization data, which is much better than the baseline (7.1%) (p < 0.001). These findings may provide some hints for the policies and pro-grams targetting on improving immunization data qaulity in the pastoralist communities.

Keywords: data quality, immunization, verification factor, pastoralist region

Procedia PDF Downloads 78
24018 Pregnant Women in Substance Abuse: Transition of Characteristics and Mining of Association from Teds-a 2011 to 2018

Authors: Md Tareq Ferdous Khan, Shrabanti Mazumder, MB Rao

Abstract:

Background: Substance use during pregnancy is a longstanding public health problem that results in severe consequences for pregnant women and fetuses. Methods: Eight (2011-2018) datasets on pregnant women’s admissions are extracted from TEDS-A. Distributions of sociodemographic, substance abuse behaviors, and clinical characteristics are constructed and compared over the years for trends by the Cochran-Armitage test. Market basket analysis is used in mining the association among polysubstance abuse. Results: Over the years, pregnant woman admissions as the percentage of total and female admissions remain stable, where total annual admissions range from 1.54 to about 2 million with the female share of 33.30% to 35.61%. Pregnant women aged 21-29, 12 or more years of education, white race, unemployed, holding independent living status are among the most vulnerable. Concerns prevail on a significant number of polysubstance users, young age at first use, frequency of daily users, and records of prior admissions (60%). Trends of abused primary substances show a significant rise in heroin (66%) and methamphetamine (46%) over the years, although the latest year shows a considerable downturn. On the other hand, significant decreasing patterns are evident for alcohol (43%), marijuana or hashish (24%), cocaine or crack (23%), other opiates or synthetics (36%), and benzodiazepines (29%). Basket analysis reveals some patterns of co-occurrence of substances consistent over the years. Conclusions: This comprehensive study can work as a reference to identify the most vulnerable groups based on their characteristics and deal with the most hazardous substances from their evidence of co-occurrence.

Keywords: basket analysis, pregnant women, substance abuse, trend analysis

Procedia PDF Downloads 179
24017 Early Gastric Cancer Prediction from Diet and Epidemiological Data Using Machine Learning in Mizoram Population

Authors: Brindha Senthil Kumar, Payel Chakraborty, Senthil Kumar Nachimuthu, Arindam Maitra, Prem Nath

Abstract:

Gastric cancer is predominantly caused by demographic and diet factors as compared to other cancer types. The aim of the study is to predict Early Gastric Cancer (ECG) from diet and lifestyle factors using supervised machine learning algorithms. For this study, 160 healthy individual and 80 cases were selected who had been followed for 3 years (2016-2019), at Civil Hospital, Aizawl, Mizoram. A dataset containing 11 features that are core risk factors for the gastric cancer were extracted. Supervised machine algorithms: Logistic Regression, Naive Bayes, Support Vector Machine (SVM), Multilayer perceptron, and Random Forest were used to analyze the dataset using Python Jupyter Notebook Version 3. The obtained classified results had been evaluated using metrics parameters: minimum_false_positives, brier_score, accuracy, precision, recall, F1_score, and Receiver Operating Characteristics (ROC) curve. Data analysis results showed Naive Bayes - 88, 0.11; Random Forest - 83, 0.16; SVM - 77, 0.22; Logistic Regression - 75, 0.25 and Multilayer perceptron - 72, 0.27 with respect to accuracy and brier_score in percent. Naive Bayes algorithm out performs with very low false positive rates as well as brier_score and good accuracy. Naive Bayes algorithm classification results in predicting ECG showed very satisfactory results using only diet cum lifestyle factors which will be very helpful for the physicians to educate the patients and public, thereby mortality of gastric cancer can be reduced/avoided with this knowledge mining work.

Keywords: Early Gastric cancer, Machine Learning, Diet, Lifestyle Characteristics

Procedia PDF Downloads 139
24016 Identifying Critical Success Factors for Data Quality Management through a Delphi Study

Authors: Maria Paula Santos, Ana Lucas

Abstract:

Organizations support their operations and decision making on the data they have at their disposal, so the quality of these data is remarkably important and Data Quality (DQ) is currently a relevant issue, the literature being unanimous in pointing out that poor DQ can result in large costs for organizations. The literature review identified and described 24 Critical Success Factors (CSF) for Data Quality Management (DQM) that were presented to a panel of experts, who ordered them according to their degree of importance, using the Delphi method with the Q-sort technique, based on an online questionnaire. The study shows that the five most important CSF for DQM are: definition of appropriate policies and standards, control of inputs, definition of a strategic plan for DQ, organizational culture focused on quality of the data and obtaining top management commitment and support.

Keywords: critical success factors, data quality, data quality management, Delphi, Q-Sort

Procedia PDF Downloads 200
24015 Recovery of Au and Other Metals from Old Electronic Components by Leaching and Liquid Extraction Process

Authors: Tomasz Smolinski, Irena Herdzik-Koniecko, Marta Pyszynska, M. Rogowski

Abstract:

Old electronic components can be easily found nowadays. Significant quantities of valuable metals such as gold, silver or copper are used for the production of advanced electronic devices. Old useless electronic device slowly became a new source of precious metals, very often more efficient than natural. For example, it is possible to recover more gold from 1-ton personal computers than seventeen tons of gold ore. It makes urban mining industry very profitable and necessary for sustainable development. For the recovery of metals from waste of electronic equipment, various treatment options based on conventional physical, hydrometallurgical and pyrometallurgical processes are available. In this group hydrometallurgy processes with their relatively low capital cost, low environmental impact, potential for high metal recoveries and suitability for small scale applications, are very promising options. Institute of Nuclear Chemistry and Technology has great experience in hydrometallurgy processes especially focused on recovery metals from industrial and agricultural wastes. At the moment, urban mining project is carried out. The method of effective recovery of valuable metals from central processing units (CPU) components has been developed. The principal processes such as acidic leaching and solvent extraction were used for precious metals recovery from old processors and graphic cards. Electronic components were treated by acidic solution at various conditions. Optimal acid concentration, time of the process and temperature were selected. Precious metals have been extracted to the aqueous phase. At the next step, metals were selectively extracted by organic solvents such as oximes or tributyl phosphate (TBP) etc. Multistage mixer-settler equipment was used. The process was optimized.

Keywords: electronic waste, leaching, hydrometallurgy, metal recovery, solvent extraction

Procedia PDF Downloads 125
24014 Application of Groundwater Level Data Mining in Aquifer Identification

Authors: Liang Cheng Chang, Wei Ju Huang, You Cheng Chen

Abstract:

Investigation and research are keys for conjunctive use of surface and groundwater resources. The hydrogeological structure is an important base for groundwater analysis and simulation. Traditionally, the hydrogeological structure is artificially determined based on geological drill logs, the structure of wells, groundwater levels, and so on. In Taiwan, groundwater observation network has been built and a large amount of groundwater-level observation data are available. The groundwater level is the state variable of the groundwater system, which reflects the system response combining hydrogeological structure, groundwater injection, and extraction. This study applies analytical tools to the observation database to develop a methodology for the identification of confined and unconfined aquifers. These tools include frequency analysis, cross-correlation analysis between rainfall and groundwater level, groundwater regression curve analysis, and decision tree. The developed methodology is then applied to groundwater layer identification of two groundwater systems: Zhuoshui River alluvial fan and Pingtung Plain. The abovementioned frequency analysis uses Fourier Transform processing time-series groundwater level observation data and analyzing daily frequency amplitude of groundwater level caused by artificial groundwater extraction. The cross-correlation analysis between rainfall and groundwater level is used to obtain the groundwater replenishment time between infiltration and the peak groundwater level during wet seasons. The groundwater regression curve, the average rate of groundwater regression, is used to analyze the internal flux in the groundwater system and the flux caused by artificial behaviors. The decision tree uses the information obtained from the above mentioned analytical tools and optimizes the best estimation of the hydrogeological structure. The developed method reaches training accuracy of 92.31% and verification accuracy 93.75% on Zhuoshui River alluvial fan and training accuracy 95.55%, and verification accuracy 100% on Pingtung Plain. This extraordinary accuracy indicates that the developed methodology is a great tool for identifying hydrogeological structures.

Keywords: aquifer identification, decision tree, groundwater, Fourier transform

Procedia PDF Downloads 141
24013 Facilitating Written Biology Assessment in Large-Enrollment Courses Using Machine Learning

Authors: Luanna B. Prevost, Kelli Carter, Margaurete Romero, Kirsti Martinez

Abstract:

Writing is an essential scientific practice, yet, in several countries, the increasing university science class-size limits the use of written assessments. Written assessments allow students to demonstrate their learning in their own words and permit the faculty to evaluate students’ understanding. However, the time and resources required to grade written assessments prohibit their use in large-enrollment science courses. This study examined the use of machine learning algorithms to automatically analyze student writing and provide timely feedback to the faculty about students' writing in biology. Written responses to questions about matter and energy transformation were collected from large-enrollment undergraduate introductory biology classrooms. Responses were analyzed using the LightSide text mining and classification software. Cohen’s Kappa was used to measure agreement between the LightSide models and human raters. Predictive models achieved agreement with human coding of 0.7 Cohen’s Kappa or greater. Models captured that when writing about matter-energy transformation at the ecosystem level, students focused on primarily on the concepts of heat loss, recycling of matter, and conservation of matter and energy. Models were also produced to capture writing about processes such as decomposition and biochemical cycling. The models created in this study can be used to provide automatic feedback about students understanding of these concepts to biology faculty who desire to use formative written assessments in larger enrollment biology classes, but do not have the time or personnel for manual grading.

Keywords: machine learning, written assessment, biology education, text mining

Procedia PDF Downloads 260
24012 Characterization of Tailings From Traditional Panning of Alluvial Gold Ore (A Case Study of Ilesa - Southwestern Nigeria Goldfield Tailings Dumps)

Authors: Olaniyi Awe, Adelana R. Adetunji, Abraham Adeleke

Abstract:

Field observation revealed a lot of artisanal gold mining activities in Ilesa gold belt of southwestern Nigeria. The possibility of alluvial and lode gold deposits in commercial quantities around this location is very high, as there are many resident artisanal gold miners who have been mining and trading alluvial gold ore for decades and to date in the area. Their major process of solid gold recovery from its ore is by gravity concentration using the convectional panning method. This method is simple to learn and fast to recover gold from its alluvial ore, but its effectiveness is based on rules of thumb and the artisanal miners' experience in handling gold ore panning tool while processing the ore. Research samples from five alluvial gold ore tailings dumps were collected and studied. Samples were subjected to particle size analysis and mineralogical and elemental characterization using X-Ray Diffraction (XRD) and Particle-Induced X-ray Emission (PIXE) methods, respectively. The results showed that the tailings were of major quartz in association with albite, plagioclase, mica, gold, calcite and sulphide minerals. The elemental composition analysis revealed a 15ppm of gold concentration in particle size fraction of -90 microns in one of the tailings dumps investigated. These results are significant. It is recommended that heaps of panning tailings should be further reprocessed using other gold recovery methods such as shaking tables, flotation and controlled cyanidation that can efficiently recover fine gold particles that were previously lost into the gold panning tailings. The tailings site should also be well controlled and monitored so that these heavy minerals do not find their way into surrounding water streams and rivers, thereby causing health hazards.

Keywords: gold ore, panning, PIXE, tailings, XRD

Procedia PDF Downloads 71
24011 Comprehensive Study of Data Science

Authors: Asifa Amara, Prachi Singh, Kanishka, Debargho Pathak, Akshat Kumar, Jayakumar Eravelly

Abstract:

Today's generation is totally dependent on technology that uses data as its fuel. The present study is all about innovations and developments in data science and gives an idea about how efficiently to use the data provided. This study will help to understand the core concepts of data science. The concept of artificial intelligence was introduced by Alan Turing in which the main principle was to create an artificial system that can run independently of human-given programs and can function with the help of analyzing data to understand the requirements of the users. Data science comprises business understanding, analyzing data, ethical concerns, understanding programming languages, various fields and sources of data, skills, etc. The usage of data science has evolved over the years. In this review article, we have covered a part of data science, i.e., machine learning. Machine learning uses data science for its work. Machines learn through their experience, which helps them to do any work more efficiently. This article includes a comparative study image between human understanding and machine understanding, advantages, applications, and real-time examples of machine learning. Data science is an important game changer in the life of human beings. Since the advent of data science, we have found its benefits and how it leads to a better understanding of people, and how it cherishes individual needs. It has improved business strategies, services provided by them, forecasting, the ability to attend sustainable developments, etc. This study also focuses on a better understanding of data science which will help us to create a better world.

Keywords: data science, machine learning, data analytics, artificial intelligence

Procedia PDF Downloads 59
24010 Interpreting Privacy Harms from a Non-Economic Perspective

Authors: Christopher Muhawe, Masooda Bashir

Abstract:

With increased Internet Communication Technology(ICT), the virtual world has become the new normal. At the same time, there is an unprecedented collection of massive amounts of data by both private and public entities. Unfortunately, this increase in data collection has been in tandem with an increase in data misuse and data breach. Regrettably, the majority of data breach and data misuse claims have been unsuccessful in the United States courts for the failure of proof of direct injury to physical or economic interests. The requirement to express data privacy harms from an economic or physical stance negates the fact that not all data harms are physical or economic in nature. The challenge is compounded by the fact that data breach harms and risks do not attach immediately. This research will use a descriptive and normative approach to show that not all data harms can be expressed in economic or physical terms. Expressing privacy harms purely from an economic or physical harm perspective negates the fact that data insecurity may result into harms which run counter the functions of privacy in our lives. The promotion of liberty, selfhood, autonomy, promotion of human social relations and the furtherance of the existence of a free society. There is no economic value that can be placed on these functions of privacy. The proposed approach addresses data harms from a psychological and social perspective.

Keywords: data breach and misuse, economic harms, privacy harms, psychological harms

Procedia PDF Downloads 174
24009 A Step Magnitude Haptic Feedback Device and Platform for Better Way to Review Kinesthetic Vibrotactile 3D Design in Professional Training

Authors: Biki Sarmah, Priyanko Raj Mudiar

Abstract:

In the modern world of remotely interactive virtual reality-based learning and teaching, including professional skill-building training and acquisition practices, as well as data acquisition and robotic systems, the revolutionary application or implementation of field-programmable neurostimulator aids and first-hand interactive sensitisation techniques into 3D holographic audio-visual platforms have been a coveted dream of many scholars, professionals, scientists, and students. Integration of 'kinaesthetic vibrotactile haptic perception' along with an actuated step magnitude contact profiloscopy in augmented reality-based learning platforms and professional training can be implemented by using an extremely calculated and well-coordinated image telemetry including remote data mining and control technique. A real-time, computer-aided (PLC-SCADA) field calibration based algorithm must be designed for the purpose. But most importantly, in order to actually realise, as well as to 'interact' with some 3D holographic models displayed over a remote screen using remote laser image telemetry and control, all spatio-physical parameters like cardinal alignment, gyroscopic compensation, as well as surface profile and thermal compositions, must be implemented using zero-order type 1 actuators (or transducers) because they provide zero hystereses, zero backlashes, low deadtime as well as providing a linear, absolutely controllable, intrinsically observable and smooth performance with the least amount of error compensation while ensuring the best ergonomic comfort ever possible for the users.

Keywords: haptic feedback, kinaesthetic vibrotactile 3D design, medical simulation training, piezo diaphragm based actuator

Procedia PDF Downloads 140
24008 Machine Learning Analysis of Student Success in Introductory Calculus Based Physics I Course

Authors: Chandra Prayaga, Aaron Wade, Lakshmi Prayaga, Gopi Shankar Mallu

Abstract:

This paper presents the use of machine learning algorithms to predict the success of students in an introductory physics course. Data having 140 rows pertaining to the performance of two batches of students was used. The lack of sufficient data to train robust machine learning models was compensated for by generating synthetic data similar to the real data. CTGAN and CTGAN with Gaussian Copula (Gaussian) were used to generate synthetic data, with the real data as input. To check the similarity between the real data and each synthetic dataset, pair plots were made. The synthetic data was used to train machine learning models using the PyCaret package. For the CTGAN data, the Ada Boost Classifier (ADA) was found to be the ML model with the best fit, whereas the CTGAN with Gaussian Copula yielded Logistic Regression (LR) as the best model. Both models were then tested for accuracy with the real data. ROC-AUC analysis was performed for all the ten classes of the target variable (Grades A, A-, B+, B, B-, C+, C, C-, D, F). The ADA model with CTGAN data showed a mean AUC score of 0.4377, but the LR model with the Gaussian data showed a mean AUC score of 0.6149. ROC-AUC plots were obtained for each Grade value separately. The LR model with Gaussian data showed consistently better AUC scores compared to the ADA model with CTGAN data, except in two cases of the Grade value, C- and A-.

Keywords: machine learning, student success, physics course, grades, synthetic data, CTGAN, gaussian copula CTGAN

Procedia PDF Downloads 28
24007 Manganese Contamination Exacerbates Reproductive Stress in a Suicidally-Breeding Marsupial

Authors: Ami Fadhillah Amir Abdul Nasir, Amanda C. Niehaus, Skye F. Cameron, Frank A. Von Hippel, John Postlethwait​, Robbie S. Wilson

Abstract:

For suicidal breeders, the physiological stresses and energetic costs of breeding are fatal. Environmental stressors such as pollution should compound these costs, yet suicidal breeding is so rare among mammals that this is unknown. Here, we explored the consequences of metal contamination to the health, aging and performance of endangered, suicidally-breeding northern quolls (Dasyurus hallucatus) living near an active manganese mine on Groote Eylandt, Northern Territory, Australia. We found respirable manganese dust at levels exceeding international recommendations even 20km from mining sites and substantial accumulation of manganese within quolls’ hair, testes, and in two brain regions—the neocortex and cerebellum, responsible for sensory perception and motor function, respectively. Though quolls did not differ in sprint speeds, motor skill, or manoeuvrability, those with higher accumulation of manganese crashed at lower speeds during manoeuvrability tests, indicating a potential effect on sight or cognition. Immune function and telomere length declined over the breeding season, as expected with ageing, but manganese contamination exacerbated immune declines and suppressed cortisol. Unexpectedly, male quolls with higher levels of manganese had longer telomeres, supporting evidence of unusual telomere dynamics among Dasyurids—though whether this affects their lifespan is unknown. We posit that sublethal contamination via pollution, mining, or urbanisation imposes physiological costs on wildlife that may diminish reproductive success or survival.

Keywords: ecotoxicology, heavy metal, manganese, telomere length, cortisol, locomotor

Procedia PDF Downloads 294
24006 A Method for the Extraction of the Character's Tendency from Korean Novels

Authors: Min-Ha Hong, Kee-Won Kim, Seung-Hoon Kim

Abstract:

The character in the story-based content, such as novels and movies, is one of the core elements to understand the story. In particular, the character’s tendency is an important factor to analyze the story-based content, because it has a significant influence on the storyline. If readers have the knowledge of the tendency of characters before reading a novel, it will be helpful to understand the structure of conflict, episode and relationship between characters in the novel. It may therefore help readers to select novel that the reader wants to read. In this paper, we propose a method of extracting the tendency of the characters from a novel written in Korean. In advance, we build the dictionary with pairs of the emotional words in Korean and English since the emotion words in the novel’s sentences express character’s feelings. We rate the degree of polarity (positive or negative) of words in our emotional words dictionary based on SenticNet. Then we extract characters and emotion words from sentences in a novel. Since the polarity of a word grows strong or weak due to sentence features such as quotations and modifiers, our proposed method consider them to calculate the polarity of characters. The information of the extracted character’s polarity can be used in the book search service or book recommendation service.

Keywords: character tendency, data mining, emotion word, Korean novel

Procedia PDF Downloads 322