Search results for: estimation algorithms
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3842

152 The Bidirectional Effect between Parental Burnout and the Child’s Internalized and/or Externalized Behaviors

Authors: Aline Woine, Moïra Mikolajczak, Virginie Dardier, Isabelle Roskam

Abstract:

Background information: Becoming a parent is said to be the happiest event one can ever experience in one’s life. This popular (and almost absolute) truth, which no reasonable and decent human being would ever dare question on pain of being singled out as a bad parent, contrasts with the nuances that reality offers. Indeed, while many parents do thrive in their parenting role, some others falter and become progressively overwhelmed by it, ineluctably caught in a spiral of exhaustion. Parental burnout (henceforth PB) sets in when parental demands (stressors) exceed parental resources. While it is now generally acknowledged that PB affects the parent’s behavior in terms of neglect and violence toward their offspring, little is known about the impact that the syndrome might have on the children’s internalized (anxious and depressive symptoms, somatic complaints, etc.) and/or externalized (irritability, violence, aggressiveness, conduct disorder, oppositional disorder, etc.) behaviors. Furthermore, at the time of writing, to the best of our knowledge, no research has yet tested the reverse effect, namely, that of the child's internalized and/or externalized behaviors on the onset and/or maintenance of parental burnout symptoms. Goals and hypotheses: The present pioneering research proposes to fill an important gap in the existing literature related to PB by investigating the bidirectional effect between PB and the child’s internalized and/or externalized behaviors. Relying on a cross-lagged longitudinal study with three waves of data collection (4 months apart), our study tests a transactional model with bidirectional and recursive relations between observed variables at the three waves, as well as autoregressive paths and cross-sectional correlations.
Methods: As we write this, wave-two data are being collected via Qualtrics, and we expect a final sample of about 600 participants composed of French-speaking (snowball sample) and English-speaking (Prolific sample) parents. Structural equation modeling is employed using Stata version 17. In order to retain as much statistical power as possible, we use all available data and therefore apply maximum likelihood with missing values (mlmv) as the estimation method to compute the parameter estimates. To limit, insofar as possible, shared method variance bias in the evaluation of the child’s behavior, the study relies on a multi-informant evaluation approach. Expected results: We expect our three-wave longitudinal study to show that PB symptoms (measured at T1) raise the occurrence/intensity of the child’s externalized and/or internalized behaviors (measured at T2 and T3). We further expect the occurrence/intensity of the child’s externalized and/or internalized behaviors (measured at T1) to increase the risk of PB (measured at T2 and T3). Conclusion: Should our hypotheses be confirmed, our results will make an important contribution to the understanding of both PB and children’s behavioral issues, thereby opening interesting theoretical and clinical avenues.

Keywords: exhaustion, structural equation modeling, cross-lagged longitudinal study, violence and neglect, child-parent relationship

Procedia PDF Downloads 73
151 Mechanical Testing of Composite Materials for Monocoque Design in Formula Student Car

Authors: Erik Vassøy Olsen, Hirpa G. Lemu

Abstract:

Inspired by the Formula-1 competition, IMechE (Institution of Mechanical Engineers) and Formula SAE (Society of Automotive Engineers) organize annual competitions for university and college students worldwide to compete with a single-seat race car they have designed and built. The design of the chassis or the frame is a key component of the competition because the weight and stiffness properties are directly related to the performance of the car and the safety of the driver. In addition, a reduced weight of the chassis has a direct influence on the design of other components in the car. Among others, it improves the power-to-weight ratio and the aerodynamic performance. As the power output of the engine or the battery installed in the car is limited to 80 kW, increasing the power-to-weight ratio demands reduction of the weight of the chassis, which represents the major part of the weight of the car. In order to reduce the weight of the car, the ION Racing team from the University of Stavanger, Norway, opted for a monocoque design. To ensure fulfilment of the above-mentioned requirements of the chassis, the monocoque design should provide sufficient torsional stiffness and absorb the impact energy in case of a possible collision. The study reported in this article is based on the requirements for the Formula Student competition. As part of this study, diverse mechanical tests were conducted to determine the mechanical properties and performance of the monocoque design. Following a comprehensive theoretical study of the mechanical properties of sandwich composite materials and the requirements of monocoque design in the competition rules, diverse tests were conducted, including a 3-point bending test, a perimeter shear test and a test for absorbed energy. The test panels were homemade and prepared with a size equivalent to the side impact zone of the monocoque, i.e. 275 mm x 500 mm, so that the results obtained from the tests can be representative.
Different layups of the test panels with identical core material and the same number of layers of carbon fibre were tested and compared. The influence of the core material thickness was also studied. Furthermore, analytical calculations and numerical analysis were conducted to check compliance with the stated rules for Structural Equivalency with steel grade SAE/AISI 1010. The test results were also compared with calculated results with respect to bending and torsional stiffness, energy absorption, buckling, etc. The obtained results demonstrate that the composite material selected for the monocoque design, with its composition and strength, has structural properties equivalent to those of a welded frame and thus complies with the competition requirements. The developed analytical calculation algorithms and relations will be useful for future monocoque designs with different lay-ups and compositions.

Keywords: composite material, Formula student, ION racing, monocoque design, structural equivalence

Procedia PDF Downloads 504
150 The Impact of Online Learning on Visual Learners

Authors: Ani Demetrashvili

Abstract:

As online learning continues to reshape the landscape of education, questions arise regarding its efficacy for diverse learning styles, particularly for visual learners. This abstract delves into the impact of online learning on visual learners, exploring how digital mediums influence their educational experience and how educational platforms can be optimized to cater to their needs. Visual learners comprise a significant portion of the student population, characterized by their preference for visual aids such as diagrams, charts, and videos to comprehend and retain information. Traditional classroom settings often struggle to accommodate these learners adequately, relying heavily on auditory and written forms of instruction. The advent of online learning presents both opportunities and challenges in addressing the needs of visual learners. Online learning platforms offer a plethora of multimedia resources, including interactive simulations, virtual labs, and video lectures, which align closely with the preferences of visual learners. These platforms have the potential to enhance engagement, comprehension, and retention by presenting information in visually stimulating formats. However, the effectiveness of online learning for visual learners hinges on various factors, including the design of learning materials, user interface, and instructional strategies. Research into the impact of online learning on visual learners encompasses a multidisciplinary approach, drawing from fields such as cognitive psychology, education, and human-computer interaction. Studies employ qualitative and quantitative methods to assess visual learners' preferences, cognitive processes, and learning outcomes in online environments. Surveys, interviews, and observational studies provide insights into learners' preferences for specific types of multimedia content and interactive features. 
Cognitive tasks, such as memory recall and concept mapping, shed light on the cognitive mechanisms underlying learning in digital settings. Eye-tracking studies offer valuable data on attentional patterns and information processing during online learning activities. The findings from research on the impact of online learning on visual learners have significant implications for educational practice and technology design. Educators and instructional designers can use insights from this research to create more engaging and effective learning materials for visual learners. Strategies such as incorporating visual cues, providing interactive activities, and scaffolding complex concepts with multimedia resources can enhance the learning experience for visual learners in online environments. Moreover, online learning platforms can leverage the findings to improve their user interface and features, making them more accessible and inclusive for visual learners. Customization options, adaptive learning algorithms, and personalized recommendations based on learners' preferences and performance can enhance the usability and effectiveness of online platforms for visual learners.

Keywords: online learning, visual learners, digital education, technology in learning

Procedia PDF Downloads 39
149 Measurement and Modelling of HIV Epidemic among High Risk Groups and Migrants in Two Districts of Maharashtra, India: An Application of Forecasting Software-Spectrum

Authors: Sukhvinder Kaur, Ashok Agarwal

Abstract:

Background: For the first time in 2009, India was able to generate estimates of HIV incidence (the number of new HIV infections per year). Analysis of epidemic projections helped reveal that the number of new annual HIV infections in India had declined by more than 50% during the last decade (GOI Ministry of Health and Family Welfare, 2010). The National AIDS Control Organisation (NACO) then planned to scale up its efforts in generating projections through epidemiological analysis and modelling, drawing on recently available sources of evidence such as HIV Sentinel Surveillance (HSS), India Census data and other critical data sets. Recently, NACO generated the current round of HIV estimates (2012) using the globally recommended tool “Spectrum Software” and produced estimates of adult HIV prevalence, annual new infections, number of people living with HIV, AIDS-related deaths and treatment needs. The state-level prevalence and incidence projections produced were used to project the consequences of the epidemic in Spectrum. Building on the state-level HIV estimates generated by NACO, the USAID-funded PIPPSE project, under the leadership of NACO, extended the estimations and projections to district level using the same Spectrum software. In 2011, adult HIV prevalence in Maharashtra, one of the high-prevalence states, was 0.42%, above the national average of 0.27%. Considering the heterogeneity of the HIV epidemic between districts, two districts of Maharashtra, Thane and Mumbai, were selected to estimate and project the number of People-Living-with-HIV/AIDS (PLHIV), HIV prevalence among adults and annual new HIV infections until 2017.
Methodology: Inputs in Spectrum included demographic data from the Census of India since 1980 and the Sample Registration System; programmatic data on ‘Alive and on ART (adult and children)’, ‘Mother-Baby pairs under PPTCT’ and ‘High Risk Group (HRG)-size mapping estimates’; and surveillance data from various rounds of HSS, National Family Health Survey–III, Integrated Biological and Behavioural Assessment and Behavioural Sentinel Surveillance. Major Findings: Assuming current programmatic interventions in these districts, new infections among HRGs and migrants are estimated to decrease by 12 percentage points in Thane and 31 percentage points in Mumbai between 2011 and 2017. Conclusions: The project also validated the decrease in new HIV infections among one of the high-risk groups, FSWs, using program cohort data from 2012 to 2016. Though there is a decrease in HIV prevalence and new infections in Thane and Mumbai, a further decrease is possible if appropriate programme responses, strategies and interventions are designed for specific target groups based on this evidence. Moreover, the evidence needs to be validated by other estimation/modelling techniques, and evidence can be generated for other districts of the state where HIV prevalence is high and reliable data sources are available, to understand the epidemic within the local context.

Keywords: HIV sentinel surveillance, high risk groups, projections, new infections

Procedia PDF Downloads 211
148 Evolving Credit Scoring Models using Genetic Programming and Language Integrated Query Expression Trees

Authors: Alexandru-Ion Marinescu

Abstract:

There exist a plethora of methods in the scientific literature which tackle the well-established task of credit score evaluation. In its most abstract form, a credit scoring algorithm takes as input several credit applicant properties, such as age, marital status, employment status, loan duration, etc. and must output a binary response variable (i.e. “GOOD” or “BAD”) stating whether the client is susceptible to payment return delays. Data imbalance is a common occurrence among financial institution databases, with the majority being classified as “GOOD” clients (clients that respect the loan return calendar) alongside a small percentage of “BAD” clients. But it is the “BAD” clients we are interested in since accurately predicting their behavior is crucial in preventing unwanted loss for loan providers. We add to this whole context the constraint that the algorithm must yield an actual, tractable mathematical formula, which is friendlier towards financial analysts. To this end, we have turned to genetic algorithms and genetic programming, aiming to evolve actual mathematical expressions using specially tailored mutation and crossover operators. As far as data representation is concerned, we employ a very flexible mechanism – LINQ expression trees, readily available in the C# programming language, enabling us to construct executable pieces of code at runtime. As the title implies, they model trees, with intermediate nodes being operators (addition, subtraction, multiplication, division) or mathematical functions (sin, cos, abs, round, etc.) and leaf nodes storing either constants or variables. There is a one-to-one correspondence between the client properties and the formula variables. The mutation and crossover operators work on a flattened version of the tree, obtained via a pre-order traversal. 
A consequence of our chosen technique is that we can identify and discard client properties which do not take part in the final score evaluation, effectively acting as a dimensionality reduction scheme. We compare ourselves with state of the art approaches, such as support vector machines, Bayesian networks, and extreme learning machines, to name a few. The data sets we benchmark against amount to a total of 8, of which we mention the well-known Australian credit and German credit data sets, and the performance indicators are the following: percentage correctly classified, area under curve, partial Gini index, H-measure, Brier score and Kolmogorov-Smirnov statistic, respectively. Finally, we obtain encouraging results, which, although placing us in the lower half of the hierarchy, drive us to further refine the algorithm.
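The tree representation and pre-order flattening described in this abstract can be illustrated with a minimal sketch. It is written in Python rather than the C# LINQ expression trees the author uses, and the `Node` class, the example formula, and the variable names are all hypothetical. It shows how a flattened tree reveals which client properties the formula never touches, which is the dimensionality-reduction effect mentioned above.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Node:
    # label is an operator, a function name, a constant, or a variable name
    label: str
    children: List["Node"] = field(default_factory=list)


def preorder(node: Node) -> List[str]:
    """Flatten the tree: emit the node first, then its children left to right."""
    out = [node.label]
    for child in node.children:
        out.extend(preorder(child))
    return out


def used_variables(node: Node, variables: Set[str]) -> Set[str]:
    """Return the subset of known variables that appear somewhere in the tree."""
    return {label for label in preorder(node) if label in variables}


# Hypothetical evolved formula: (age * 0.5) + abs(loan_duration)
tree = Node("+", [
    Node("*", [Node("age"), Node("0.5")]),
    Node("abs", [Node("loan_duration")]),
])

# Hypothetical client properties, one formula variable per property
ALL_VARIABLES = {"age", "marital_status", "employment_status", "loan_duration"}

flat = preorder(tree)
unused = ALL_VARIABLES - used_variables(tree, ALL_VARIABLES)
```

Mutation and crossover operators could then act on `flat` as a plain sequence, while `unused` lists the properties that can be discarded for this formula.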

Keywords: expression trees, financial credit scoring, genetic algorithm, genetic programming, symbolic evolution

Procedia PDF Downloads 118
147 Development and Experimental Evaluation of a Semiactive Friction Damper

Authors: Juan S. Mantilla, Peter Thomson

Abstract:

Seismic events may result in discomfort for building occupants, structural damage or even building collapse. Traditional design aims to reduce the dynamic response of structures by increasing stiffness, thus increasing construction costs and design forces. Structural control systems arise as an alternative to reduce these dynamic responses. A commonly used control system in buildings is the passive friction damper, which adds energy dissipation through damping mechanisms induced by sliding friction between its surfaces. Passive friction dampers are usually installed on the diagonals of braced buildings, but such devices have the disadvantage that they are optimal only for a range of sliding force; outside that range their efficiency decreases. This implies that each passive friction damper is designed, built and commercialized for a specific sliding/clamping force, at which the damper shifts from a locked state to a slip state, where it dissipates energy through friction. The risk of the device's efficiency varying with the sliding force is that the dynamic properties of the building can change as a result of many factors, including damage caused by a seismic event. In this case, the expected forces in the building can change and thus considerably reduce the efficiency of the damper (which is designed for a specific sliding force). It is also evident that when a seismic event occurs, the forces on each floor vary over time, which means that the damper's efficiency is not optimal at all times. Semi-active friction devices adapt their sliding force to keep the device in the slipping phase as much as possible; because of this, the effectiveness of the device depends on the control strategy used. This paper deals with the development and performance evaluation of a low-cost, reduced-scale Semiactive Variable Friction Damper (SAVFD) to reduce vibrations of structures subject to earthquakes.
The SAVFD consists of (1) a hydraulic brake adapted to (2) a servomotor, which is controlled by (3) an Arduino board that acquires accelerations or displacements from (4) sensors on the floors immediately above and below, and (5) a power supply that can be a pair of common batteries. A test structure, based on a benchmark structure for structural control, was designed and constructed. The SAVFD and the structure are experimentally characterized. A numerical model of the structure and the SAVFD is developed based on the dynamic characterization. Decentralized control algorithms were modeled and later tested experimentally in shaking table tests using earthquake and frequency chirp signals. The controlled structure with the SAVFD achieved reductions greater than 80% in relative displacements and accelerations in comparison to the uncontrolled structure.

Keywords: earthquake response, friction damper, semiactive control, shaking table

Procedia PDF Downloads 378
146 Predicting Susceptibility to Coronary Artery Disease using Single Nucleotide Polymorphisms with a Large-Scale Data Extraction from PubMed and Validation in an Asian Population Subset

Authors: K. H. Reeta, Bhavana Prasher, Mitali Mukerji, Dhwani Dholakia, Sangeeta Khanna, Archana Vats, Shivam Pandey, Sandeep Seth, Subir Kumar Maulik

Abstract:

Introduction: Research has demonstrated a connection between coronary artery disease (CAD) and genetics. We performed deep literature mining using both bioinformatics and manual efforts to identify the susceptible polymorphisms in coronary artery disease. Further, the study sought to validate these findings in an Asian population. Methodology: In the first phase, we used an automated pipeline which organizes and presents structured information on SNPs, populations and diseases. The information was obtained by applying Natural Language Processing (NLP) techniques to approximately 28 million PubMed abstracts. To accomplish this, we utilized Python scripts to extract and curate disease-related data, filter out false positives, and categorize them into 24 hierarchical groups using Named Entity Recognition (NER) algorithms. From this search, a total of 466 unique PubMed Identifiers (PMIDs) and 694 Single Nucleotide Polymorphisms (SNPs) related to coronary artery disease (CAD) were identified. To refine the selection process, a thorough manual examination of all the studies was carried out. Specifically, SNPs that demonstrated susceptibility to CAD and exhibited a positive Odds Ratio (OR) were selected, and a final pool of 324 SNPs was compiled. The next phase involved validating the identified SNPs in DNA samples of 96 CAD patients and 37 healthy controls from the Indian population using the Global Screening Array. Results: Of the 324 SNPs, only 108 were expressed, and a further 4 SNPs showed a significant difference in minor allele frequency between cases and controls. These were rs187238 of the IL-18 gene, rs731236 of the VDR gene, rs11556218 of the IL16 gene and rs5882 of the CETP gene. Prior research has reported the association of these SNPs with various pathways such as endothelial damage, susceptibility of vitamin D receptor (VDR) polymorphisms, and reduction of HDL-cholesterol levels, ultimately leading to the development of CAD.
Among these, only rs731236 had been studied in the Indian population, and then only in the context of diabetes and vitamin D deficiency. For the first time, these SNPs were reported to be associated with CAD in the Indian population. Conclusion: This pool of 324 SNPs is a unique resource that can help to uncover risk associations in CAD. Here, we validated it in the Indian population. Further validation in different populations may offer valuable insights, contribute to the development of a screening tool, and help enable primary prevention strategies targeted at the vulnerable population.
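The case-control comparison of allele frequencies mentioned above rests on a simple count. The sketch below assumes a biallelic SNP with known genotype counts. The function name and all counts are invented for illustration and are not taken from the study, which does not describe its own computation.

```python
def alt_allele_frequency(n_ref_hom: int, n_het: int, n_alt_hom: int) -> float:
    """Frequency of the alternate allele at a biallelic SNP.

    Each individual carries two alleles: heterozygotes contribute one
    alternate allele, alternate homozygotes contribute two.
    """
    total_alleles = 2 * (n_ref_hom + n_het + n_alt_hom)
    if total_alleles == 0:
        raise ValueError("no genotypes supplied")
    return (n_het + 2 * n_alt_hom) / total_alleles


# Hypothetical genotype counts (ref/ref, ref/alt, alt/alt); not study data
cases_freq = alt_allele_frequency(60, 30, 10)
controls_freq = alt_allele_frequency(80, 18, 2)
difference = cases_freq - controls_freq
```

Whether such a difference is statistically significant would still need a formal test on the allele counts, which this sketch does not attempt.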

Keywords: coronary artery disease, single nucleotide polymorphism, susceptible SNP, bioinformatics

Procedia PDF Downloads 76
145 Optimizing Stormwater Sampling Design for Estimation of Pollutant Loads

Authors: Raja Umer Sajjad, Chang Hee Lee

Abstract:

Stormwater runoff is the leading contributor to pollution of receiving waters. In response, an efficient stormwater monitoring program is required to quantify and eventually reduce stormwater pollution. The overall goals of stormwater monitoring programs primarily include the identification of high-risk dischargers and the development of total maximum daily loads (TMDLs). The challenge in developing a better monitoring program is to reduce the variability in flux estimates due to sampling errors, since the success of a monitoring program mainly depends on the accuracy of the estimates. Apart from sampling errors, manpower and budgetary constraints also influence the quality of the estimates. This study attempted to develop an optimum stormwater monitoring design considering both cost and the quality of the estimated pollutant flux. Three years of stormwater monitoring data (2012–2014) from a mixed land use area located within the Geumhak watershed, South Korea, were evaluated. The regional climate is humid and precipitation is usually well distributed through the year. The investigation of a large number of water quality parameters is time-consuming and resource intensive. In order to identify a suite of easy-to-measure parameters to act as surrogates, Principal Component Analysis (PCA) was applied. Means, standard deviations, coefficients of variation (CV) and other simple statistics were computed using the multivariate statistical analysis software SPSS 22.0. The implication of sampling time for monitoring results, the number of samples required during a storm event and the impact of seasonal first flush were also identified. Based on the observations derived from the PCA biplot and the correlation matrix, total suspended solids (TSS) was identified as a potential surrogate for turbidity, total phosphorus and heavy metals such as lead, chromium, and copper, whereas Chemical Oxygen Demand (COD) was identified as a surrogate for organic matter.
The CV among the monitored water quality parameters was found to be high (ranging from 3.8 to 15.5). This suggests that using a grab sampling design to estimate the mass emission rates in the study area can lead to errors due to large variability. The TSS discharge load calculation error was found to be only 2% between two different sample size approaches, i.e. 17 samples per storm event and 6 equally distributed samples per storm event. Both seasonal first flush and event first flush phenomena were observed in the study area for most water quality parameters. Samples taken at the initial stage of a storm event generally overestimate the mass emissions; however, it was found that a grab sample collected after the initial hour of a storm event more closely approximates the mean concentration of the event. It was concluded that site-specific and regional-climate-specific interventions can be made to optimize the stormwater monitoring program in order to make it more effective and economical.
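The quantities discussed in this abstract can be made concrete with a short sketch: the coefficient of variation, an event load, and a flow-weighted event mean concentration. The formulas are the common textbook definitions, which the abstract does not spell out, so treat them as assumptions. All sample values, units, and function names are hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence


def coefficient_of_variation(values: Sequence[float]) -> float:
    """Sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)


def event_load(conc_mg_per_l: Sequence[float],
               flow_l_per_s: Sequence[float],
               interval_s: float) -> float:
    """Approximate pollutant mass (mg) as sum of concentration * flow * time step."""
    return sum(c * q * interval_s for c, q in zip(conc_mg_per_l, flow_l_per_s))


def event_mean_concentration(conc_mg_per_l: Sequence[float],
                             flow_l_per_s: Sequence[float]) -> float:
    """Flow-weighted mean concentration over equally spaced samples (mg/L)."""
    weighted = sum(c * q for c, q in zip(conc_mg_per_l, flow_l_per_s))
    return weighted / sum(flow_l_per_s)


# Hypothetical TSS samples taken every 600 s during one storm event
tss = [10.0, 20.0, 30.0]   # mg/L
flow = [1.0, 2.0, 1.0]     # L/s

cv = coefficient_of_variation(tss)
load = event_load(tss, flow, 600.0)
emc = event_mean_concentration(tss, flow)
```

Comparing `load` computed from a dense sample set against the same calculation on a thinned subset is one way to express the kind of sample-size comparison the abstract reports.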

Keywords: first flush, pollutant load, stormwater monitoring, surrogate parameters

Procedia PDF Downloads 240
144 Generation of Knowledge with Self-Learning Methods for Ophthalmic Data

Authors: Klaus Peter Scherer, Daniel Knöll, Constantin Rieder

Abstract:

Problem and Purpose: Intelligent systems are available and helpful in supporting the human decision process, especially when complex surgical eye interventions are necessary and must be performed. Normally, such a decision support system consists of a knowledge-based module, which is responsible for the actual assistance, provided through explanation and logical reasoning processes. The interview-based acquisition and generation of the complex knowledge itself is critical, because there are different correlations between the complex parameters. So, in this project, (semi)automated self-learning methods are researched and developed to enhance the quality of such a decision support system. Methods: For ophthalmic data sets of real patients in a hospital, advanced data mining procedures seem to be very helpful. In particular, subgroup analysis methods are developed, extended and used to analyze and find the correlations and conditional dependencies between the structured patient data. After finding causal dependencies, a ranking must be performed for the generation of rule-based representations. For this, anonymous patient data are transformed into a special machine language format. The imported data are used as input for conditional probability algorithms to calculate the parameter distributions with respect to a given goal parameter. Results: In the field of knowledge discovery, advanced methods and applications could be used to produce operation-related and patient-related correlations. So, new knowledge was generated by finding causal relations between the operational equipment, the medical instances and the patient-specific history through a dependency ranking process. After transformation into association rules, logic-based representations were available for the clinical experts to evaluate the new knowledge. The structured data sets comprise about 80 parameters as characteristic features per patient.
For different extended patient groups (100, 300, 500), both single-target and multi-target values were set for the subgroup analysis. So the newly generated hypotheses could be interpreted regarding their dependence on or independence of the number of patients. Conclusions: The aim and advantage of such a semi-automatic self-learning process is the extension of the knowledge base by finding new parameter correlations. The discovered knowledge is transformed into association rules and serves as a rule-based representation of the knowledge in the knowledge base. Moreover, more than one goal parameter of interest can be considered by the semi-automated learning process. With ranking procedures, the strongest premises and also conjunctively associated conditions can be found to conclude the goal parameter of interest. So the knowledge hidden in structured tables or lists can be extracted as a rule-based representation. This provides real assistance in communicating with the clinical experts.
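The ranking of premises for a goal parameter can be illustrated with the usual support and confidence measures for an association rule. This is a generic sketch and not the authors' subgroup analysis method. The parameter names and the records are invented for illustration.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]


def matches(record: Record, condition: Record) -> bool:
    """True if the record has every parameter value listed in the condition."""
    return all(record.get(key) == value for key, value in condition.items())


def rule_metrics(records: List[Record],
                 premise: Record,
                 goal: Record) -> Tuple[float, float]:
    """Return (support, confidence) of the rule 'premise -> goal'.

    support    = share of all records matching premise and goal
    confidence = conditional probability of goal given premise
    """
    n_premise = sum(matches(r, premise) for r in records)
    n_both = sum(matches(r, premise) and matches(r, goal) for r in records)
    support = n_both / len(records)
    confidence = n_both / n_premise if n_premise else 0.0
    return support, confidence


# Hypothetical anonymised records with made-up parameter names
records = [
    {"lens_type": "A", "outcome": "good"},
    {"lens_type": "A", "outcome": "good"},
    {"lens_type": "A", "outcome": "poor"},
    {"lens_type": "B", "outcome": "good"},
]

support, confidence = rule_metrics(
    records, premise={"lens_type": "A"}, goal={"outcome": "good"}
)
```

Sorting candidate premises by confidence, with a minimum support threshold, is one simple way to obtain a ranking; conjunctive premises are expressed by adding more keys to `premise`.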

Keywords: an expert system, knowledge-based support, ophthalmic decision support, self-learning methods

Procedia PDF Downloads 253
143 Identity and Mental Adaptation of Deaf and Hard-of-Hearing Students

Authors: N. F. Mikhailova, M. E. Fattakhova, M. A. Mironova, E. V. Vyacheslavova

Abstract:

For the mental and social adaptation of deaf and hard-of-hearing people, cultural and social aspects (the formation of identity, or acculturation, and educational conditions) are highly significant. We studied 137 deaf and hard-of-hearing students in different educational situations. We used these methods: Big Five (Costa & McCrae, 1997), TRF (Becker, 1989), WCQ (Lazarus & Folkman, 1988), self-esteem, and coping strategies (Jambor & Elliott, 2005), self-stigma scale (Mikhailov, 2008). The type of self-identification of students depended on the degree of deafness, type of education, and method of communication in the family: greater hearing loss, education in schools for the deaf, and gesture communication increased the likelihood of a 'deaf' acculturation. Less hearing loss, inclusive education in a public school or a school for the hearing-impaired, and mixed communication in the family contributed to the formation of a 'hearing' acculturation. The choice of specific coping depended on the degree of deafness: greater hearing loss increased the coping strategy 'withdrawal into the deaf world' and decreased 'bicultural skills' coping. People with mild hearing loss tended to conceal it. In the context of ongoing discussion, we researched personality characteristics in deaf and hard-of-hearing students, coping and other deafness-associated factors depending on their acculturation type. Students who identified themselves with the 'hearing world' had high self-esteem, a higher level of extraversion, self-awareness, personal resources, willingness to cooperate, better psychological health, emotional stability, a higher capacity for empathy, a life more filled with feelings and meaning, and a high sense of self-worth. They also actively used the coping strategies of problem-solving, acceptance of responsibility, and positive reappraisal. Students who limited themselves to the culture of deaf people had more severe hearing loss and accordingly had more communication barriers.
The absence or infrequent use of coping strategies among these students points to a decreased level of stress in their lives. Their self-esteem has not been challenged in the specific social environment of students with the same severity of hearing loss, and thus this environment provided a sense of comfort (we can assume this from the high scores on psychological health, personality resources, and emotional stability). Students with bicultural acculturation had a higher level of psychological resources: they used Positive Reappraisal coping more often and had a higher level of psychological health. Lack of belonging to a certain culture (marginality) leads to personality disintegration and social and psychological maladaptation: deaf and hard-of-hearing students with marginal identification had lower self-esteem, worse psychological health and personal resources, and a lower level of extraversion, self-confidence and life satisfaction. They, in fact, become a 'risk group' (many of them dropped out of universities, divorced, and one even ended up in the ranks of ISIS). All these data argue for the importance of a cultural 'anchor' for people with hearing deprivation. Supported by the RFBR No 19-013-00406.

Keywords: acculturation, coping, deafness, marginality

Procedia PDF Downloads 204
142 Ethnic Identity as an Asset: Linking Ethnic Identity, Perceived Social Support, and Mental Health among Indigenous Adults in Taiwan

Authors: A.H.Y. Lai, C. Teyra

Abstract:

In Taiwan, there are 16 official indigenous groups, accounting for 2.3% of the total population. Like other indigenous populations worldwide, indigenous peoples in Taiwan have poorer mental health because of their history of oppression and colonisation. Amid the negative narratives, the ethnic identity of cultural minorities is their unique psychological and cultural asset. Moreover, positive socialisation is found to be related to strong ethnic identity. Based on Phinney’s theory of ethnic identity development and social support theory, this study adopted a strength-based approach, conceptualising ethnic identity as the central organising principle that linked perceived social support and mental health among indigenous adults in Taiwan. Aims. The overall aim was to examine the effect of ethnic identity and social support on mental health. Specific aims were to examine: (1) the association between ethnic identity and mental health; (2) the association between perceived social support and mental health; (3) the indirect effect of ethnic identity linking perceived social support and mental health. Methods. Participants were indigenous adults in Taiwan (n=200; mean age=29.51; Female=31%, Male=61%, Others=8%). A cross-sectional quantitative design was implemented using data collected in the year 2020. Respondent-driven sampling was used. Standardised measurements were: Ethnic Identity Scale (6-item); Social Support Questionnaire-SF (6-item); Patient Health Questionnaire (9-item); and Generalised Anxiety Disorder (7-item). Covariates were age, gender and economic satisfaction. A four-stage structural equation modelling (SEM) approach with robust maximum likelihood estimation was employed using Mplus 8.0. Step 1: A measurement model was built and tested using confirmatory factor analysis (CFA). Step 2: Factor covariates were re-specified as direct effects in the SEM. Covariates were added. 
The direct effects of (1) ethnic identity and social support on depression and anxiety and (2) social support on ethnic identity were tested. The indirect effect of ethnic identity was examined with the bootstrapping technique. Results. The CFA model showed satisfactory fit statistics: χ²(df)=869.69(608), p<.05; Comparative fit index (CFI)/Tucker-Lewis fit index (TLI)=0.95/0.94; root mean square error of approximation (RMSEA)=0.05; Standardized Root Mean Squared Residual (SRMR)=0.05. Ethnic identity is represented by two latent factors: ethnic identity-commitment and ethnic identity-exploration. Depression, anxiety and social support are single-factor latent variables. For the SEM, model fit statistics were: χ²(df)=779.26(527), p<.05; CFI/TLI=0.94/0.93; RMSEA=0.05; SRMR=0.05. Ethnic identity-commitment (b=-0.30) and social support (b=-0.33) had direct negative effects on depression, but ethnic identity-exploration did not. Ethnic identity-commitment (b=-0.43) and social support (b=-0.31) had direct negative effects on anxiety, while identity-exploration (b=0.24) demonstrated a positive effect. Social support had direct positive effects on ethnic identity-exploration (b=0.26) and ethnic identity-commitment (b=0.31). Mediation analysis demonstrated the indirect effect of ethnic identity-commitment linking social support and depression (b=0.22). Implications: Results underscore the role of social support in preventing depression via ethnic identity commitment among indigenous adults in Taiwan. Adopting the strength-based approach, mental health practitioners can mobilise indigenous peoples’ commitment to their group to promote their well-being.

Keywords: ethnic identity, indigenous population, mental health, perceived social support

Procedia PDF Downloads 103
141 Contextual Factors of Innovation for Improving Commercial Banks' Performance in Nigeria

Authors: Tomola Obamuyi

Abstract:

The banking system in Nigeria adopted innovative banking with the aim of enhancing financial inclusion, making financial services readily and cheaply available to the majority of the people, and contributing to the efficiency of the financial system. Some of the innovative services include Automatic Teller Machines (ATMs), National Electronic Fund Transfer (NEFT), Point of Sale (PoS), internet (Web) banking, Mobile Money payment (MMO), Real-Time Gross Settlement (RTGS), and agent banking, among others. The introduction of these payment systems is expected to increase bank efficiency and customers' satisfaction, culminating in better performance for the commercial banks. However, opinions differ on the possible effects of the various innovative payment systems on the performance of commercial banks in the country. Thus, this study empirically determines how commercial banks use innovation to gain competitive advantage in the specific context of Nigeria's finance and business. The study also analyses the effects of financial innovation on the performance of commercial banks when different periods of analysis are considered. The study employed secondary data from 2009 to 2018, the period that witnessed aggressive innovation in the financial sector of the country. The Vector Autoregression (VAR) estimation technique was used to forecast the relative variance of each random innovation to the variables in the VAR, to examine the effect of a standard deviation shock to one of the innovations on current and future values of the impulse response, and to determine the causal relationship between the variables (VAR Granger causality test). The study also employed Multi-Criteria Decision Making (MCDM) to rank the innovations and the performance criteria of Return on Assets (ROA) and Return on Equity (ROE). The entropy method of MCDM was used to determine which of the performance criteria better reflects the contributions of the various innovations in the banking sector. 
On the other hand, the Range of Values (ROV) method was used to rank the contributions of the seven innovations to performance. The analysis was done based on medium term (five years) and long run (ten years) of innovations in the sector. The impulse response function derived from the VAR system indicated that the response of ROA to the values of cheques transaction, values of NEFT transactions, values of POS transactions was positive and significant in the periods of analysis. The paper also confirmed with entropy and range of value that, in the long run, both the CHEQUE and MMO performed best while NEFT was next in performance. The paper concluded that commercial banks would enhance their performance by continuously improving on the services provided through Cheques, National Electronic Fund Transfer and Point of Sale since these instruments have long run effects on their performance. This will increase the confidence of the populace and encourage more usage/patronage of these services. The banking sector will in turn experience better performance which will improve the economy of the country.
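The entropy weighting step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common Shannon-entropy formulation (column-normalised shares, entropy scaled by 1/ln m, weights proportional to 1 minus entropy); the abstract does not state which variant the authors used, and the scores below are hypothetical, not the study's data.

```python
import math

def entropy_weights(matrix):
    """Return one weight per column (criterion) using Shannon entropy.

    matrix: list of rows (alternatives), each a list of positive values.
    Assumes at least one column is not perfectly uniform.
    """
    m = len(matrix)          # number of alternatives (rows)
    n = len(matrix[0])       # number of criteria (columns)
    k = 1.0 / math.log(m)    # scales each column entropy into [0, 1]
    divergences = []
    for j in range(n):
        column = [row[j] for row in matrix]
        total = sum(column)
        shares = [x / total for x in column]
        entropy = -k * sum(p * math.log(p) for p in shares if p > 0)
        divergences.append(1.0 - entropy)   # more dispersion -> more weight
    total_div = sum(divergences)
    return [d / total_div for d in divergences]

# Hypothetical scores: rows = payment channels, columns = (ROA, ROE).
scores = [
    [1.2, 9.0],
    [1.1, 15.0],
    [1.3, 4.0],
]
weights = entropy_weights(scores)
print(weights)
```

In this toy table the second criterion varies far more across channels than the first, so it receives the larger weight; a criterion on which all alternatives score alike carries almost no discriminating information.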

Keywords: Bank performance, financial innovation, multi-criteria decision making, vector autoregression

Procedia PDF Downloads 121
140 Behavioral Patterns of Adopting Digitalized Services (E-Sport versus Sports Spectating) Using Agent-Based Modeling

Authors: Justyna P. Majewska, Szymon M. Truskolaski

Abstract:

The growing importance of digitalized services in the so-called new economy, including the e-sports industry, has been observed recently. Various demographic or technological changes lead consumers to modify their needs, not regarding the services themselves but the method of their application (attracting customers, forms of payment, new content, etc.). In the case of leisure related to competitive spectating activities, there is a growing need to participate in events whose content is not sports competitions but computer game challenges, that is, e-sport. The literature in this area has so far focused on determining the number of e-sport fans with elements of a simple statistical description (mainly concerning demographic characteristics such as age, gender, and place of residence). Meanwhile, the development of the industry is influenced by a combination of many different, intertwined demographic, personality and psychosocial characteristics of customers, as well as the characteristics of their environment. Therefore, there is a need for a deeper recognition of the determinants of the behavioral patterns of customers selecting digitalized services, which, in the absence of available large data sets, can be achieved by using econometric simulations (multi-agent modeling). The cognitive aim of the study is to reveal internal and external determinants of behavioral patterns of customers, taking into account various variants of economic development (the pace of digitization and technological development, socio-demographic changes, etc.). In the paper, an agent-based model with heterogeneous agents (characteristics of customers themselves and their environment) was developed, which allowed identifying a three-stage development scenario: i) initial interest, ii) standardization, and iii) full professionalization. The probabilities regarding the transition process were estimated using the Method of Simulated Moments. 
The estimation of the agent-based model parameters and sensitivity analysis reveals crucial factors that have driven a rising trend in e-sport spectating and, in a wider perspective, the development of digitalized services. Among the psychosocial characteristics of customers, these are the level of familiarization with the rules of games as well as sports disciplines, active and passive participation history, and individual perception of challenging activities. Environmental factors include the general reception of games, the number and level of recognition of community builders, and the level of technological development of streaming as well as community building platforms. However, the crucial factor underlying the good predictive power of the model is the level of professionalization. In the initial interest phase, the entry barriers for new customers are high. They decrease during the phase of standardization and increase again in the phase of full professionalization, when new customers perceive participation history as inaccessible. In this case, they are prone to switch to new methods of service application: in the case of e-sport vs. sports, to new content and more modern methods of its delivery. In a wider context, the findings in the paper support the idea of a life cycle of services regarding methods of their application from “traditional” to digitalized.

Keywords: agent-based modeling, digitalized services, e-sport, spectators motives

Procedia PDF Downloads 172
139 Water Ingress into Underground Mine Voids in the Central Rand Goldfields Area, South Africa-Fluid Induced Seismicity

Authors: Artur Cichowicz

Abstract:

The last active mine in the Central Rand Goldfields area (50 km x 15 km) ceased operations in 2008. This resulted in the closure of the pumping stations, which previously maintained the underground water level in the mining voids. As a direct consequence of the water being allowed to flood the mine voids, seismic activity has increased directly beneath the populated area of Johannesburg. Monitoring of seismicity in the area has been on-going for over five years using the network of 17 strong ground motion sensors. The objective of the project is to improve strategies for mine closure. The evolution of the seismicity pattern was investigated in detail. Special attention was given to seismic source parameters such as magnitude, scalar seismic moment and static stress drop. Most events are located within historical mine boundaries. The seismicity pattern shows a strong relationship between the presence of the mining void and high levels of seismicity; no seismicity migration patterns were observed outside the areas of old mining. Seven years after the pumping stopped, the evolution of the seismicity has indicated that the area is not yet in equilibrium. The level of seismicity in the area appears to not be decreasing over time since the number of strong events, with Mw magnitudes above 2, is still as high as it was when monitoring began over five years ago. The average rate of seismic deformation is 1.6x10^13 Nm/year. Constant seismic deformation was not observed over the last 5 years. The deviation from the average is in the order of 6x10^13 Nm/year, which is a significant deviation. The variation of cumulative seismic moment indicates that a constant deformation rate model is not suitable. Over the most recent five year period, the total cumulative seismic moment released in the Central Rand Basin was 9.0x10^14 Nm. This is equivalent to one earthquake of magnitude 3.9. This is significantly less than what was experienced during the mining operation. 
Characterization of seismicity triggered by a rising water level in the area can be achieved through the estimation of source parameters. Static stress drop heavily influences ground motion amplitude, which plays an important role in risk assessments of potential seismic hazards in inhabited areas. The observed static stress drop in this study varied from 0.05 MPa to 10 MPa. It was found that large static stress drops could be associated with both small and large events. The temporal evolution of the inter-event time provides an understanding of the physical mechanisms of earthquake interaction. Changes in the characteristics of the inter-event time are produced when a stress change is applied to a group of faults in the region. Results from this study indicate that the fluid-induced source has a shorter inter-event time in comparison to a random distribution. This behaviour corresponds to a clustering of events, in which short recurrence times tend to be close to each other, forming clusters of events.
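The equivalence between the cumulative seismic moment and a single magnitude 3.9 event quoted in this abstract can be checked with a short sketch. It assumes the standard moment magnitude relation Mw = (2/3)(log10 M0 - 9.1) with M0 in N·m; the abstract does not say which relation the author applied.

```python
import math

def moment_magnitude(m0_newton_metres):
    """Moment magnitude Mw from scalar seismic moment M0 given in N·m.

    Assumed relation: Mw = (2/3) * (log10(M0) - 9.1).
    """
    return (2.0 / 3.0) * (math.log10(m0_newton_metres) - 9.1)

# Cumulative moment over five years, as quoted in the abstract.
total_moment = 9.0e14  # N·m
mw = moment_magnitude(total_moment)
print(round(mw, 1))  # 3.9
```

Because the relation is logarithmic, a tenfold change in seismic moment shifts the magnitude by only two thirds of a unit.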

Keywords: inter-event time, fluid induced seismicity, mine closure, spectral parameters of seismic source

Procedia PDF Downloads 285
138 Localized Recharge Modeling of a Coastal Aquifer from a Dam Reservoir (Korba, Tunisia)

Authors: Nejmeddine Ouhichi, Fethi Lachaal, Radhouane Hamdi, Olivier Grunberger

Abstract:

Located in the Cap Bon peninsula (Tunisia), the Lebna dam was built in 1987 to counter the local saltwater intrusion taking place in the coastal aquifer of Korba. The first intention was to reduce coastal groundwater over-pumping by supplying surface water to a large irrigation system. An unpredicted beneficial effect was recorded with the occurrence of a direct localized recharge to the coastal aquifer by leakage through the geological material of the southern bank of the lake. The hydrological balance of the reservoir dam gave an estimation of the annual leakage volume, but dynamic processes and sound quantification of recharge inputs are still required to understand the localized effect of the recharge in terms of piezometry and quality. The present work focused on simulating the recharge process to confirm the hypothesis, establish a sound quantification of the water supply to the coastal aquifer, and extend it to multi-annual effects. A spatial frame of 30 km² was used for modeling. Intensive outcrop and geophysical surveys based on 68 electrical resistivity soundings were used to characterize the aquifer 3D geometry and the limit of the Plio-quaternary geological material concerned by the underground flow paths. Permeabilities were determined using 17 pumping tests on wells and piezometers. Six seasonal piezometric surveys on 71 wells around the southern reservoir dam banks were performed during the 2019-2021 period. Eight monitoring boreholes with high-frequency (15 min) piezometric data were used to examine dynamical aspects. Model boundary conditions were specified using the geophysics interpretations coupled with the piezometric maps. The dam-groundwater flow model was built using Visual MODFLOW software. Firstly, steady-state calibration based on the first piezometric map of February 2019 was established to estimate the permanent flow related to the different reservoir levels. 
Secondly, piezometric data for the 2019-2021 period were used for transient state calibration and to confirm the robustness of the model. Preliminary results confirmed the temporal link between the reservoir level and the localized recharge flow, with a strong threshold effect for levels below 16 m.a.s.l. The good agreement between the computed flow through recharge cells on the southern banks and the hydrological budget of the reservoir opens the path to future simulation scenarios of the dilution plume imposed by the localized recharge. The dam reservoir-groundwater flow-model simulation results indicate a potential for storage of up to 17 mm/year in existing wells, under gravity-feed conditions during level increases in the reservoir over the three years of operation. The Lebna dam groundwater flow model characterized a spatiotemporal relation between groundwater and surface water.

Keywords: leakage, MODFLOW, saltwater intrusion, surface water-groundwater interaction

Procedia PDF Downloads 138
137 Application of Deep Learning Algorithms in Agriculture: Early Detection of Crop Diseases

Authors: Manaranjan Pradhan, Shailaja Grover, U. Dinesh Kumar

Abstract:

The farming community in India, as well as in other parts of the world, is one of the most highly stressed communities due to reasons such as increasing input costs (cost of seeds, fertilizers, pesticide), droughts, and reduced revenue, leading to farmer suicides. The lack of an integrated farm advisory system in India adds to the farmers' problems. Farmers need the right information during the early stages of a crop’s lifecycle to prevent damage and loss in revenue. In this paper, we use deep learning techniques to develop an early warning system for detection of crop diseases using images taken by farmers with their smartphones. The research work leads to building a smart assistant using analytics and big data which could help farmers with early diagnosis of crop diseases and corrective actions. The classical approach for crop disease management has been to identify diseases at crop level. Recently, ImageNet classification using the convolutional neural network (CNN) has been successfully used to identify diseases at individual plant level. Our model uses convolution filters, max pooling, dense layers and dropouts (to avoid overfitting). The models are built for binary classification (healthy or not healthy) and multi-class classification (identifying which disease). Transfer learning is used to modify the weights of parameters learnt through the ImageNet dataset and apply them to crop diseases, which reduces the number of epochs needed to learn. One-shot learning is used to learn from very few images, while data augmentation techniques such as rotation, zoom, shift and blurring are used to improve accuracy with images taken from farms. Models built using a combination of these techniques are more robust for deployment in the real world. Our model is validated using the tomato crop. In India, tomato is affected by 10 different diseases. Our model achieves an accuracy of more than 95% in correctly classifying the diseases. 
The main contribution of our research is to create a personal assistant for farmers for managing plant disease. Although the model was validated using the tomato crop, it can be easily extended to other crops. The advancement of computing technology and the availability of large data have made possible the success of deep learning applications in computer vision, natural language processing, image recognition, etc. With these robust models and huge smartphone penetration, the feasibility of implementing these models is high, resulting in timely advice to farmers and thus increasing farmers' income and reducing input costs.

Keywords: analytics in agriculture, CNN, crop disease detection, data augmentation, image recognition, one shot learning, transfer learning

Procedia PDF Downloads 120
136 Nature of Forest Fragmentation Owing to Human Population along Elevation Gradient in Different Countries in Hindu Kush Himalaya Mountains

Authors: Pulakesh Das, Mukunda Dev Behera, Manchiraju Sri Ramachandra Murthy

Abstract:

Large numbers of people living in and around the Hindu Kush Himalaya (HKH) region depend on this diverse mountainous region for ecosystem services. Following the global trend, this region is also experiencing rapid population growth and demand for timber and agricultural land. The eight countries sharing the HKH region have different forest resource utilization and conservation policies that exert varying forces on the forest ecosystem. This has created a variable spatial as well as altitudinal gradient in the rate of deforestation and the corresponding forest patch fragmentation. The quantitative relationship between fragmentation and demography has not been established before for HKH along the elevation gradient. The current study was carried out to attribute the overall and differing nature of landscape fragmentation along the altitudinal gradient to the demography of each of the sharing countries. We used tree canopy cover data derived from Landsat data to analyze the deforestation and afforestation rates and the corresponding landscape fragmentation observed during 2000-2010. Area-weighted mean radius of gyration (AMN radius of gyration) was computed owing to its advantage as a spatial indicator of fragmentation over non-spatial fragmentation indices. Using the subtraction method, the change in fragmentation was computed for 2000-2010. Using the tree canopy cover data as a surrogate for forest cover, the highest forest loss was observed in Myanmar, followed by China, India, Bangladesh, Nepal, Pakistan, Bhutan, and Afghanistan. However, the sequence of fragmentation was different: the maximum fragmentation was observed in Myanmar, followed by India, China, Bangladesh, and Bhutan, whereas an increase in fragmentation was seen in the sequence Nepal, Pakistan, and Afghanistan. Using the SRTM-derived DEM, we observed a higher rate of fragmentation up to 2400 m, which corresponded with high human population for the years 2000 and 2010. 
To derive the nature of fragmentation along the altitudinal gradients, the Statistica software was used, where a user-defined function was utilized for regression, applying the Gauss-Newton estimation method with 50 iterations. We observed an overall logarithmic decrease in fragmentation change (area-weighted mean radius of gyration), forest cover loss, and population growth during 2000-2010 along the elevation gradient, with very high R² values (i.e., 0.889, 0.895, and 0.944, respectively). The observed negative logarithmic function, with the major contribution in the initial elevation gradients, suggests gap-filling afforestation in the lower altitudes to enhance forest patch connectivity. Our findings on the pattern of forest fragmentation and human population across the elevation gradient in the HKH region will have policy-level implications for different nations and would help in characterizing hotspots of change. The availability of free satellite-derived data products on forest cover and DEM, grid data on demography, and the utility of geospatial tools helped in the quick evaluation of the forest fragmentation vis-a-vis human impact pattern along the elevation gradient in HKH.
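The regression step described in this abstract can be sketched as a Gauss-Newton fit capped at 50 iterations. The functional form y = a + b ln(x) is an assumption (the abstract only says "negative logarithmic function"), the function names are illustrative, and the data points are synthetic rather than the study's values.

```python
import math

def gauss_newton_log_fit(xs, ys, a=0.0, b=0.0, max_iter=50, tol=1e-12):
    """Fit y = a + b*ln(x) by Gauss-Newton least squares."""
    logs = [math.log(x) for x in xs]
    n = len(xs)
    s_l = sum(logs)
    s_ll = sum(v * v for v in logs)
    det = n * s_ll - s_l * s_l           # determinant of J^T J
    for _ in range(max_iter):
        # Residuals of the current model; Jacobian columns are 1 and ln(x).
        res = [y - (a + b * v) for y, v in zip(ys, logs)]
        g_a = sum(res)
        g_b = sum(r * v for r, v in zip(res, logs))
        # Solve the 2x2 normal equations (J^T J) delta = J^T r.
        da = (s_ll * g_a - s_l * g_b) / det
        db = (n * g_b - s_l * g_a) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < tol:
            break
    return a, b

def r_squared(xs, ys, a, b):
    """Coefficient of determination for the fitted curve."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (a + b * math.log(x))) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# Synthetic example: elevation (m) versus a quantity that decays with ln(elevation).
elevations = [200.0, 400.0, 800.0, 1600.0, 3200.0]
values = [10.0 - 1.2 * math.log(e) for e in elevations]
a_hat, b_hat = gauss_newton_log_fit(elevations, values)
print(a_hat, b_hat, r_squared(elevations, values, a_hat, b_hat))
```

Since this assumed model is linear in its two parameters, Gauss-Newton reaches the least-squares solution after the first update; the iteration cap matters only for model forms that are nonlinear in their parameters.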

Keywords: area-weighted mean radius of gyration, fragmentation, human impact, tree canopy cover

Procedia PDF Downloads 215
135 Strategies for the Optimization of Ground Resistance in Large Scale Foundations for Optimum Lightning Protection

Authors: Oibar Martinez, Clara Oliver, Jose Miguel Miranda

Abstract:

In this paper, we discuss the standard improvements which can be made to reduce the earth resistance in difficult terrains for optimum lightning protection, what the practical limitations are, and how the modeling can be refined for accurate diagnostics and ground resistance minimization. Ground resistance can be minimized via three different approaches: burying vertical electrodes connected in parallel, burying horizontal conductive plates or meshes, or modifying the terrain itself, either by changing the entire terrain material in a large volume or by adding earth-enhancing compounds. The use of vertical electrodes connected in parallel poses several practical limitations. In order to prevent loss of effectiveness, it is necessary to keep a minimum distance between electrodes, which is typically around five times the electrode length. Otherwise, the overlapping of the local equipotential lines around each electrode reduces the efficiency of the configuration. The addition of parallel electrodes reduces the resistance and facilitates the measurement, but the basic parallel resistor formula of circuit theory will always underestimate the final resistance. Numerical simulation of equipotential lines around the electrodes overcomes this limitation. The resistance of a single electrode will always be proportional to the soil resistivity. The electrodes are usually installed with a backfilling material of high conductivity, which increases the effective diameter. However, the improvement is marginal, since the electrode diameter enters the estimation of the ground resistance via a logarithmic function. Substances that are used for efficient chemical treatment must be environmentally friendly and must feature stability, high hygroscopicity, low corrosivity, and high electrical conductivity. A number of earth enhancement materials are commercially available. Many are composed of carbon-based materials or clays like bentonite. 
These materials can also be used as backfilling materials to reduce the resistance of an electrode. Chemical treatment of soil raises environmental issues. Some products contain copper sulfate or other copper-based compounds, which may not be environmentally friendly. Carbon-based compounds are relatively inexpensive and do have very low resistivities, but they also cause corrosion issues. Typically, the carbon can corrode and destroy a copper electrode in around five years. These compounds also raise potential environmental concerns. Some earthing enhancement materials contain cement, which, after installation, acquires properties that are very close to those of concrete. This prevents the earthing enhancement material from leaching into the soil. After analyzing different configurations, we conclude that a buried conductive ring with vertical electrodes connected periodically should be the optimum baseline solution for the grounding of a large structure installed on a high-resistivity terrain. In order to show this, a practical example is explained here where we simulate the ground resistance of a conductive ring buried in a terrain with a resistivity in the range of 1 kOhm·m.
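The logarithmic dependence on electrode diameter noted in this abstract can be illustrated with a small sketch. It assumes Dwight's classical approximation for a single vertical rod, R = ρ/(2πL)·(ln(8L/d) − 1), which the abstract does not cite; the rod length and diameters are hypothetical, and only the 1 kOhm·m resistivity comes from the text.

```python
import math

def rod_resistance(resistivity, length, diameter):
    """Ground resistance (ohm) of one vertical rod, Dwight's approximation.

    resistivity in ohm·m, length and diameter in metres (length >> diameter).
    """
    return resistivity / (2 * math.pi * length) * (
        math.log(8 * length / diameter) - 1
    )

rho = 1000.0      # ohm·m, resistivity range quoted in the abstract
length = 2.0      # m, hypothetical rod length

bare = rod_resistance(rho, length, 0.016)        # hypothetical 16 mm rod
backfilled = rod_resistance(rho, length, 0.160)  # 10x effective diameter

# Circuit-theory value for four identical rods in parallel. It ignores the
# overlap of equipotential lines, so the real resistance is higher than this.
ideal_parallel = bare / 4

print(bare, backfilled, ideal_parallel)
```

Resistance scales linearly with soil resistivity but only logarithmically with diameter, so even a tenfold increase in effective diameter lowers the resistance by less than a factor of two in this example.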

Keywords: grounding improvements, large scale scientific instrument, lightning risk assessment, lightning standards

Procedia PDF Downloads 139
134 Control of Belts for Classification of Geometric Figures by Artificial Vision

Authors: Juan Sebastian Huertas Piedrahita, Jaime Arturo Lopez Duque, Eduardo Luis Perez Londoño, Julián S. Rodríguez

Abstract:

The process of generating computer vision is called artificial vision. Artificial vision is a branch of artificial intelligence that allows the obtaining, processing, and analysis of any type of information, especially information obtained through digital images. Currently, artificial vision is used in manufacturing for quality control and production, as these processes can be carried out through algorithms for counting, positioning, and recognizing objects that can be measured by one or more cameras. Companies, in turn, use assembly lines formed by conveyor systems with actuators on them for moving pieces from one location to another in their production. These devices must be programmed in advance with a logic routine to perform well. Nowadays, production is the main target of every industry, along with quality and the fast completion of the different stages and processes in the production chain of any product or service being offered. The main basis of this project is to program a computer that recognizes geometric figures (circle, square, and triangle) through a camera, each one with a different color, and to link it with a group of conveyor systems that sort the figures into cubicles, which also differ from one another in color. This project is based on artificial vision; therefore, the methodology needed to develop it must be strict, and it is detailed below: 1. Methodology: 1.1 The software used in this project is Qt Creator, which is linked with the OpenCV libraries. Together, these tools are used to build the program that identifies colors and shapes directly from the camera to the computer. 1.2 Image acquisition: To start using the OpenCV libraries, it is necessary to acquire images, which can be captured by a computer’s web camera or a different specialized camera. 
1.3 The recognition of RGB colors is done in code by traversing the matrices of the captured images and comparing pixels, identifying the primary colors, which are red, green, and blue. 1.4 To detect shapes, it is necessary to segment the images, so the first step is converting the image from RGB to grayscale, to work with the dark tones of the image; then the image is binarized, which means rendering the figure in white on a black background. Finally, we find the contours of the figure in the image and count its edges to identify which figure it is. 1.5 After the color and figure have been identified, the program links with the conveyor systems, which, through the actuators, sort the figures into their respective cubicles. Conclusions: The OpenCV library is a useful tool for projects in which an interface between a computer and the environment is required, since the camera captures external characteristics that can then be processed. With the program for this project, any type of assembly line can be optimized because images from the environment can be obtained and the process would be more accurate.
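Steps 1.3 and 1.4 above can be sketched in plain Python on a tiny synthetic image. This is an illustration only, not the authors' Qt Creator/OpenCV program: the function names, the threshold, and the vertex-count rule are assumptions, and contour extraction is left out. In OpenCV these stages would typically map to `cv2.cvtColor`, `cv2.threshold`, `cv2.findContours`, and `cv2.approxPolyDP`.

```python
def to_gray(image):
    """Convert rows of (r, g, b) tuples to grayscale using common luma weights."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in image]

def binarize(gray, threshold):
    """White (255) where the pixel is brighter than threshold, else black (0)."""
    return [[255 if v > threshold else 0 for v in row] for row in gray]

def dominant_color(pixels):
    """Name the primary color whose channel sum is largest over the pixels."""
    totals = [sum(p[c] for p in pixels) for c in range(3)]
    return ("red", "green", "blue")[totals.index(max(totals))]

def shape_from_vertices(count):
    """Assumed rule: 3 vertices -> triangle, 4 -> square, anything else -> circle."""
    return {3: "triangle", 4: "square"}.get(count, "circle")

# Synthetic 4x4 frame: a red 2x2 figure on a black background.
K, R = (0, 0, 0), (220, 30, 30)
frame = [
    [K, K, K, K],
    [K, R, R, K],
    [K, R, R, K],
    [K, K, K, K],
]

mask = binarize(to_gray(frame), threshold=20)
figure_pixels = [frame[i][j]
                 for i in range(4) for j in range(4) if mask[i][j] == 255]
color = dominant_color(figure_pixels)
print(color, shape_from_vertices(4))
```

A low threshold is used because saturated primary colors are fairly dark in grayscale (pure blue maps to roughly 29 out of 255), so a mid-range threshold would erase blue figures from the mask.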

Keywords: artificial intelligence, artificial vision, binarized, grayscale, images, RGB

Procedia PDF Downloads 379
133 Forest Fire Burnt Area Assessment in a Part of West Himalayan Region Using Differenced Normalized Burnt Ratio and Neural Network Approach

Authors: Sunil Chandra, Himanshu Rawat, Vikas Gusain, Triparna Barman

Abstract:

Forest fires are a recurrent phenomenon in the Himalayan region owing to the presence of vulnerable forest types, topographical gradients, climatic weather conditions, and anthropogenic pressure. The present study focuses on the identification of forest fire-affected areas in a small part of the West Himalayan region using a differential normalized burnt ratio method and spectral unmixing methods. The study area has a rugged terrain with the presence of sub-tropical pine forest, montane temperate forest, and sub-alpine forest and scrub. The major reason for fires in this region is anthropogenic in nature, with the practice of human-induced fires for getting fresh leaves, scaring wild animals to protect agricultural crops, grazing practices within reserved forests, and igniting fires for cooking and other reasons. The fires caused by the above reasons affect a large area on the ground, necessitating its precise estimation for further management and policy making. In the present study, two approaches have been used for carrying out a burnt area analysis. The first approach followed for burnt area analysis uses a differenced normalized burnt ratio (dNBR) index approach that uses the burnt ratio values generated using the Short-Wave Infrared (SWIR) band and Near Infrared (NIR) bands of the Sentinel-2 image. The results of the dNBR have been compared with the outputs of the spectral mixing methods. It has been found that the dNBR is able to create good results in fire-affected areas having homogenous forest stratum and with slope degree <5 degrees. However, in a rugged terrain where the landscape is largely influenced by the topographical variations, vegetation types, tree density, the results may be largely influenced by the effects of topography, complexity in tree composition, fuel load composition, and soil moisture. 
Hence, burnt area assessment under such varying conditions may not be carried out effectively using the dNBR approach, which is commonly followed for burnt area assessment over large areas. Another approach attempted in the present study therefore utilizes a spectral unmixing method in which each individual pixel is tested before an information class is assigned to it. The method uses a neural network approach utilizing Sentinel-2 bands. The training and testing data are generated from the Sentinel-2 data and the national field inventory and are then used for generating outputs using ML tools. The analysis of the results indicates that the fire-affected regions and their severity can be better estimated using spectral unmixing methods, which have the capability to resolve the noise in the data and can classify each individual pixel into the precise burnt/unburnt class.
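The dNBR computation described in this abstract can be sketched as follows. This is a minimal illustration, assuming the commonly used definitions NBR = (NIR - SWIR) / (NIR + SWIR) and dNBR = NBR(pre-fire) - NBR(post-fire). The function names, the sample reflectance values, and the 0.27 threshold are hypothetical and are not taken from the study.

```python
def nbr(nir, swir):
    """Normalized Burn Ratio for one pixel; returns 0.0 when both bands are zero."""
    total = nir + swir
    return (nir - swir) / total if total != 0 else 0.0


def dnbr(pre, post):
    """Differenced NBR per pixel: pre-fire NBR minus post-fire NBR.

    `pre` and `post` are lists of (nir, swir) reflectance pairs for the same pixels.
    """
    return [nbr(*p) - nbr(*q) for p, q in zip(pre, post)]


def burnt_mask(dnbr_values, threshold=0.27):
    """Flag pixels whose dNBR exceeds an illustrative (assumed) threshold."""
    return [v > threshold for v in dnbr_values]


# Hypothetical reflectances: pixel 0 burns, pixel 1 is unchanged.
pre_fire = [(0.40, 0.10), (0.35, 0.15)]
post_fire = [(0.15, 0.30), (0.35, 0.15)]
values = dnbr(pre_fire, post_fire)
mask = burnt_mask(values)
```

A single global threshold such as this one is exactly what the abstract reports as struggling in rugged, heterogeneous terrain, which is the motivation given for the per-pixel neural network classification.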

Keywords: categorical data, log linear modeling, neural network, shifting cultivation

Procedia PDF Downloads 56
132 A Shift in Approach from Cereal Based Diet to Dietary Diversity in India: A Case Study of Aligarh District

Authors: Abha Gupta, Deepak K. Mishra

Abstract:

The food security issue in India has revolved around the availability and accessibility of cereals, which are regarded as the only food group to check hunger and improve nutrition. The significance of fruits, vegetables, meat and other food products has been totally neglected despite the fact that they provide essential nutrients to the body. There is a need to shift the emphasis from a cereal-based approach to a more diverse diet so that the aim of achieving food security may change from just reducing hunger to overall health. This paper attempts to analyse how far dietary diversity has been achieved across different socio-economic groups in India. For this purpose, the paper sets out to determine (a) the percentage share of different food groups in total food expenditure and consumption by background characteristics, (b) the source of and preference for all food items, and (c) the diversity of diet across socio-economic groups. A cross-sectional survey covering 304 households selected through proportional stratified random sampling was conducted in six villages of Aligarh district of Uttar Pradesh, India. Information on the amount of food consumed, the source of consumption and expenditure on food (74 food items grouped into 10 major food groups) was collected with a recall period of seven days. Per capita per day food consumption/expenditure was calculated by dividing consumption/expenditure by household size and by seven. The food variety score was estimated by assigning a value of 0 to those food groups/items which had not been eaten and 1 to those which had been consumed by households in the last seven days. Summing the scores of all food groups/items gave the food variety score. Diversity of diet was computed using the Herfindahl-Hirschman index. Findings of the paper show that the cereal, milk, and roots and tubers food groups contribute a major share of total consumption/expenditure. 
Consumption of these food groups varies across socio-economic groups, whereas consumption of fruit, vegetables, meat and other foods remains uniformly low. Estimation of dietary diversity shows a higher concentration of diet due to higher consumption of cereals, milk, and root and tuber products, and dietary diversity varies slightly across background groups. Muslim, Scheduled Caste, small-farmer, lower-income, food-insecure, below-poverty-line and labour families show a higher concentration of diet compared to their counterpart groups. These groups also show a lower mean number of food items consumed in a week, owing to economic constraints and the resulting lower access to expensive food items. The results advocate a shift from a cereal-based diet to dietary diversity, which includes not only cereal and milk products but also nutrient-rich food items such as fruits, vegetables, meat and other products. Integrating a dietary diversity approach into the food security programmes of the country would help to achieve nutrition security, as hidden hunger is widespread among the Indian population.
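The three measures named in this abstract (per capita per day consumption, food variety score, and the Herfindahl-Hirschman index) can be illustrated with a short sketch. It assumes the standard form of the index, the sum of squared shares, where a higher value means a more concentrated and less diverse diet. The function names and the sample household are hypothetical.

```python
def per_capita_per_day(household_total, household_size, recall_days=7):
    """Divide a household total by household size and by the recall period in days."""
    return household_total / (household_size * recall_days)


def food_variety_score(consumed):
    """Score 1 for each item with a positive amount, 0 otherwise, and sum the scores."""
    return sum(1 for amount in consumed.values() if amount > 0)


def herfindahl_index(amounts):
    """Sum of squared shares; 1.0 means the whole diet is one food group."""
    total = sum(amounts.values())
    if total == 0:
        return 0.0
    return sum((value / total) ** 2 for value in amounts.values())


# Hypothetical seven-day expenditure of one household of five, by food group.
household = {"cereal": 70.0, "milk": 20.0, "vegetables": 10.0, "fruit": 0.0}
fvs = food_variety_score(household)
hhi = herfindahl_index(household)
cereal_per_person_day = per_capita_per_day(household["cereal"], household_size=5)
```

In this made-up household, cereal alone accounts for 70% of spending, so the index is dominated by the cereal share, which mirrors the concentration pattern the abstract reports.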

Keywords: dietary diversity, food Security, India, socio-economic groups

Procedia PDF Downloads 340
131 Machine Learning and Internet of Thing for Smart-Hydrology of the Mantaro River Basin

Authors: Julio Jesus Salazar, Julio Jesus De Lama

Abstract:

The fundamental objective of hydrological studies applied to the engineering field is to determine the statistically consistent volumes or water flows that, in each case, allow us to size or design a series of elements or structures to effectively manage and develop a river basin. To determine these values, there are several ways of working within the framework of traditional hydrology: (1) Study each of the factors that influence the hydrological cycle, (2) Study the historical behavior of the hydrology of the area, (3) Study the historical behavior of hydrologically similar zones, and (4) Other studies (rain simulators or experimental basins). Of course, this range of studies in a given basin is very varied and complex and presents the difficulty of collecting the data in real time. In this complex space, the difficulty of studying the variables can only be overcome by collecting and transmitting data to decision centers through the Internet of Things and artificial intelligence. Thus, this research work implemented the learning project of the sub-basin of the Shullcas river in the Andean basin of the Mantaro river in Peru. The sensor firmware to collect and communicate hydrological parameter data was programmed and tested in similar basins of the European Union. The machine learning applications were programmed to choose the algorithms that lead to the best solution for determining the rainfall-runoff relationship captured in the different polygons of the sub-basin. Tests were carried out in the mountains of Europe, and in the sub-basins of the Shullcas river (Huancayo) and the Yauli river (Jauja) at heights close to 5000 m.a.s.l., giving the following conclusions: to guarantee correct communication, the distance between devices should not exceed 15 km. 
To minimize the energy consumption of the devices and avoid collisions between packets, it is advisable to keep distances between 5 and 10 km; in this way the transmission power can be reduced and a higher bitrate can be used. If the communication elements of the devices of the network (Internet of Things) installed in the basin do not have a good line of sight between them, the distance should be reduced to the range of 1-3 km. The energy efficiency of the Atmel microcontrollers present in Arduino is not adequate to meet the requirements of system autonomy. To increase the autonomy of the system, it is recommended to use low-consumption systems, such as the ARM Cortex L (Ultra Low Power) microcontrollers or even the Cortex M, and high-performance direct current (DC) to direct current (DC) converters. The machine learning system has initiated the learning of the Shullcas system to generate the best hydrology of the sub-basin. This will improve as machine learning and the data entered in the big data coincide every second. This will provide services to each of the applications of the complex system to return the best data of determined flows.

Keywords: hydrology, internet of things, machine learning, river basin

Procedia PDF Downloads 160
130 Estimating Poverty Levels from Satellite Imagery: A Comparison of Human Readers and an Artificial Intelligence Model

Authors: Ola Hall, Ibrahim Wahab, Thorsteinn Rognvaldsson, Mattias Ohlsson

Abstract:

The subfield of poverty and welfare estimation that applies machine learning tools and methods on satellite imagery is a nascent but rapidly growing one. This is in part driven by the sustainable development goal, whose overarching principle is that no region is left behind. Among other things, this requires that welfare levels can be accurately and rapidly estimated at different spatial scales and resolutions. Conventional tools of household surveys and interviews do not suffice in this regard. While they are useful for gaining a longitudinal understanding of the welfare levels of populations, they do not offer adequate spatial coverage for the accuracy that is needed, nor are their implementation sufficiently swift to gain an accurate insight into people and places. It is this void that satellite imagery fills. Previously, this was near-impossible to implement due to the sheer volume of data that needed processing. Recent advances in machine learning, especially the deep learning subtype, such as deep neural networks, have made this a rapidly growing area of scholarship. Despite their unprecedented levels of performance, such models lack transparency and explainability and thus have seen limited downstream applications as humans generally are apprehensive of techniques that are not inherently interpretable and trustworthy. While several studies have demonstrated the superhuman performance of AI models, none has directly compared the performance of such models and human readers in the domain of poverty studies. In the present study, we directly compare the performance of human readers and a DL model using different resolutions of satellite imagery to estimate the welfare levels of demographic and health survey clusters in Tanzania, using the wealth quintile ratings from the same survey as the ground truth data. The cluster-level imagery covers all 608 cluster locations, of which 428 were classified as rural. 
The imagery for the human readers was sourced from the Google Maps Platform at an ultra-high resolution of 0.6m per pixel at zoom level 18, while that of the machine learning model was sourced from the comparatively lower resolution Sentinel-2 10m per pixel data for the same cluster locations. Rank correlation coefficients of between 0.31 and 0.32 achieved by the human readers were much lower when compared to those attained by the machine learning model – 0.69-0.79. This superhuman performance by the model is even more significant given that it was trained on the relatively lower 10-meter resolution satellite data while the human readers estimated welfare levels from the higher 0.6m spatial resolution data from which key markers of poverty and slums – roofing and road quality – are discernible. It is important to note, however, that the human readers did not receive any training before ratings, and had this been done, their performance might have improved. The stellar performance of the model also comes with the inevitable shortfall relating to limited transparency and explainability. The findings have significant implications for attaining the objective of the current frontier of deep learning models in this domain of scholarship – eXplainable Artificial Intelligence through a collaborative rather than a comparative framework.
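The rank correlation used to compare readers and model against the survey wealth ratings can be sketched as below. The abstract does not say which rank coefficient was used; this sketch assumes Spearman's rho, computed as the Pearson correlation of average ranks. The ratings are made up for illustration and are not the study's data.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("rank correlation is undefined for constant input")
    return cov / (vx * vy) ** 0.5


# Hypothetical wealth quintiles for five clusters and two sets of estimates.
survey = [1, 2, 3, 4, 5]
reader = [2, 1, 4, 3, 5]
model = [1, 2, 3, 5, 4]
rho_reader = spearman(survey, reader)
rho_model = spearman(survey, model)
```

Because wealth quintiles take only five values, ties are frequent in real cluster data, which is why the sketch uses average ranks instead of the shortcut formula based on squared rank differences.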

Keywords: poverty prediction, satellite imagery, human readers, machine learning, Tanzania

Procedia PDF Downloads 106
129 Convolutional Neural Network Based on Random Kernels for Analyzing Visual Imagery

Authors: Ja-Keoung Koo, Kensuke Nakamura, Hyohun Kim, Dongwha Shin, Yeonseok Kim, Ji-Su Ahn, Byung-Woo Hong

Abstract:

Machine learning techniques based on a convolutional neural network (CNN) have been actively developed and successfully applied to a variety of image analysis tasks including reconstruction, noise reduction, resolution enhancement, segmentation, motion estimation, and object recognition. Classical visual information processing, ranging from low-level tasks to high-level ones, has been widely developed in the deep learning framework. Deriving a visual interpretation from high-dimensional imagery data is generally considered a challenging problem. A CNN is a class of feed-forward artificial neural network that usually consists of deep layers, the connections of which are established by a series of non-linear operations. The CNN architecture is known to be shift invariant due to its shared weights and translation invariance characteristics. However, it is often computationally intractable to optimize the network, in particular with a large number of convolution layers, due to the large number of unknowns to be optimized with respect to the training set, which is generally required to be large enough to effectively generalize the model under consideration. It is also necessary to limit the size of convolution kernels due to the computational expense, despite the recent development of effective parallel processing machinery, which leads to the use of uniformly small convolution kernels throughout the deep CNN architecture. However, it is often desirable to consider different scales in the analysis of visual features at different layers in the network. Thus, we propose a CNN model where different sizes of convolution kernels are applied at each layer based on random projection. We apply random filters of varying sizes and associate the filter responses with scalar weights that correspond to the standard deviation of the random filters. This allows a large number of random filters to be used at the cost of one scalar unknown per filter. 
The computational cost of the back-propagation procedure does not increase with larger filter sizes, even though additional computational cost is incurred in computing the convolution in the feed-forward procedure. The use of random kernels of varying sizes makes it possible to effectively analyze image features at multiple scales, leading to better generalization. The robustness and effectiveness of the proposed CNN based on random kernels are demonstrated by numerical experiments in which well-known CNN architectures are quantitatively compared with our models, which simply replace the convolution kernels with the random filters. The experimental results indicate that our model achieves better performance with fewer unknown weights. The proposed algorithm has high potential for application to a variety of visual tasks based on the CNN framework. Acknowledgement: This work was supported by the MISP (Ministry of Science and ICT), Korea, under the National Program for Excellence in SW (20170001000011001) supervised by IITP, and NRF-2014R1A2A1A11051941, NRF2017R1A2B4006023.
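One way to read the idea of fixed random filters with a single learnable scalar each is sketched below. This is an illustrative interpretation only, not the authors' implementation: the class `RandomKernelLayer`, the kernel sizes, and the zero-padded correlation are all assumptions. Each filter is drawn once from a standard normal distribution and never trained, so scaling it by its scalar weight yields a random filter whose standard deviation is set by that weight.

```python
import random


def random_kernel(size, rng):
    """Draw a fixed size x size filter from N(0, 1); it is never trained."""
    return [[rng.gauss(0.0, 1.0) for _ in range(size)] for _ in range(size)]


def correlate_same(image, kernel):
    """Zero-padded 2D cross-correlation that keeps the input height and width."""
    h, w = len(image), len(image[0])
    k = len(kernel)
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    y, x = i + di - r, j + dj - r
                    if 0 <= y < h and 0 <= x < w:
                        acc += image[y][x] * kernel[di][dj]
            out[i][j] = acc
    return out


class RandomKernelLayer:
    """Weighted sum of responses to fixed random filters of different sizes."""

    def __init__(self, sizes, seed=0):
        rng = random.Random(seed)
        self.kernels = [random_kernel(s, rng) for s in sizes]
        self.weights = [1.0] * len(sizes)  # the only learnable parameters

    def responses(self, image):
        return [correlate_same(image, kernel) for kernel in self.kernels]

    def forward(self, image):
        resp = self.responses(image)
        h, w = len(image), len(image[0])
        return [[sum(wt * r[i][j] for wt, r in zip(self.weights, resp))
                 for j in range(w)] for i in range(h)]

    def weight_gradients(self, image, upstream):
        """dLoss/dw_k = sum over pixels of upstream * response_k.

        Responses are recomputed here for simplicity; a real implementation
        would cache them from the forward pass.
        """
        return [sum(u * v for urow, rrow in zip(upstream, r)
                    for u, v in zip(urow, rrow))
                for r in self.responses(image)]


# Hypothetical 6x6 input and three filter scales.
image = [[float((i * 6 + j) % 5) for j in range(6)] for i in range(6)]
layer = RandomKernelLayer(sizes=[3, 5, 7], seed=0)
output = layer.forward(image)
```

With the responses cached, the gradient for each scalar weight is a single sum over pixels whatever the filter size, which is consistent with the abstract's remark that back-propagation cost does not grow with filter size while the feed-forward convolution does.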

Keywords: deep learning, convolutional neural network, random kernel, random projection, dimensionality reduction, object recognition

Procedia PDF Downloads 290
128 Theoretical and Experimental Investigation of Structural, Electrical and Photocatalytic Properties of K₀.₅Na₀.₅NbO₃ Lead- Free Ceramics Prepared via Different Synthesis Routes

Authors: Manish Saha, Manish Kumar Niranjan, Saket Asthana

Abstract:

The K₀.₅Na₀.₅NbO₃ (KNN) system has emerged as one of the most promising lead-free piezoelectrics over the years. In this work, we perform a comprehensive investigation of the electronic structure, lattice dynamics and dielectric/ferroelectric properties of the room temperature phase of KNN by combining ab-initio DFT-based theoretical analysis and experimental characterization. We assign symmetry labels to the KNN vibrational modes and obtain ab-initio polarized Raman spectra, infrared (IR) reflectivity, Born effective charge tensors, oscillator strengths, etc. The computed Raman spectrum is found to agree well with the experimental spectrum. In particular, the results suggest that the mode in the range ~840-870 cm⁻¹ reported in experimental studies is longitudinal optical (LO) with A₁ symmetry. The Raman mode intensities are calculated for different light polarization set-ups, which suggests that different symmetry modes are observed in different polarization set-ups. The electronic structure of KNN is investigated, and an optical absorption spectrum is obtained. Further, the performance of DFT semi-local, meta-GGA and hybrid exchange-correlation (XC) functionals in the estimation of KNN band gaps is investigated. The KNN band gaps computed using the GGA-1/2 and HSE06 hybrid functional schemes are found to be in excellent agreement with the experimental value. COHP, electron localization function and Bader charge analyses are also performed to deduce the nature of chemical bonding in KNN. Solid-state reaction and hydrothermal methods are used to prepare the KNN ceramics, and the effects of grain size on the physical characteristics of these ceramics are examined. A comprehensive study is carried out on the impact of different synthesis techniques on the structural, electrical, and photocatalytic properties of ferroelectric KNN ceramics. 
The KNN-S prepared by the solid-state method has a significantly larger grain size than the KNN-H prepared by the hydrothermal method. Furthermore, KNN-S is found to exhibit higher dielectric, piezoelectric and ferroelectric properties than KNN-H. On the other hand, increased photocatalytic activity is observed in KNN-H as compared to KNN-S. Compared to hydrothermal synthesis, solid-state synthesis causes an increase in the relative dielectric permittivity (ε′) from 2394 to 3286, remanent polarization (P_r) from 15.38 to 20.41 μC/cm², planar electromechanical coupling factor (k_p) from 0.19 to 0.28 and piezoelectric coefficient (d_33) from 88 to 125 pC/N. The KNN-S ceramics are also found to have a lower leakage current density and higher grain resistance than the KNN-H ceramics. The enhanced photocatalytic activity of KNN-H is attributed to its relatively smaller particle sizes. The KNN-S and KNN-H samples are found to have RhB solution degradation efficiencies of 20% and 65%, respectively. The experimental study highlights the importance of synthesis methods and how these can be exploited to tailor the dielectric, piezoelectric and photocatalytic properties of KNN. Overall, our study provides several important benchmark results on KNN that have not been reported so far.

Keywords: lead-free piezoelectric, Raman intensity spectrum, electronic structure, first-principles calculations, solid state synthesis, photocatalysis, hydrothermal synthesis

Procedia PDF Downloads 49
127 Sensor and Sensor System Design, Selection and Data Fusion Using Non-Deterministic Multi-Attribute Tradespace Exploration

Authors: Matthew Yeager, Christopher Willy, John Bischoff

Abstract:

The conceptualization and design phases of a system lifecycle consume a significant amount of the lifecycle budget in the form of direct tasking and capital, as well as the implicit costs associated with unforeseeable design errors that are only realized during downstream phases. Ad hoc or iterative approaches to generating system requirements oftentimes fail to consider the full array of feasible systems or product designs for a variety of reasons, including, but not limited to: initial conceptualization that oftentimes incorporates a priori or legacy features; the inability to capture, communicate and accommodate stakeholder preferences; inadequate technical designs and/or feasibility studies; and locally-, but not globally-, optimized subsystems and components. These design pitfalls can beget unanticipated developmental or system alterations with added costs, risks and support activities, heightening the risk for suboptimal system performance, premature obsolescence or forgone development. Supported by rapid advances in learning algorithms and hardware technology, sensors and sensor systems have become commonplace in both commercial and industrial products. The evolving array of hardware components (e.g., sensors, CPUs, modular/auxiliary access) as well as recognition, data fusion and communication protocols have all become increasingly complex and critical for design engineers during both conceptualization and implementation. This work seeks to develop and utilize a non-deterministic approach for sensor system design within the multi-attribute tradespace exploration (MATE) paradigm, a technique that incorporates decision theory into model-based techniques in order to explore complex design environments and discover better system designs. 
Developed to address the inherent design constraints in complex aerospace systems, MATE techniques enable project engineers to examine all viable system designs, assess attribute utility and system performance, and better align with stakeholder requirements. Whereas such previous work has been focused on aerospace systems and conducted in a deterministic fashion, this study addresses a wider array of system design elements by incorporating both traditional tradespace elements (e.g. hardware components) as well as popular multi-sensor data fusion models and techniques. Furthermore, statistical performance features to this model-based MATE approach will enable non-deterministic techniques for various commercial systems that range in application, complexity and system behavior, demonstrating a significant utility within the realm of formal systems decision-making.

Keywords: multi-attribute tradespace exploration, data fusion, sensors, systems engineering, system design

Procedia PDF Downloads 183
126 Dietary Exposure Assessment of Potentially Toxic Trace Elements in Fruits and Vegetables Grown in Akhtala, Armenia

Authors: Davit Pipoyan, Meline Beglaryan, Nicolò Merendino

Abstract:

Mining industry is one of the priority sectors of Armenian economy. Along with the solution of some socio-economic development, it brings about numerous environmental problems, especially toxic element pollution, which largely influences the safety of agricultural products. In addition, accumulation of toxic elements in agricultural products, mainly in edible parts of plants represents a direct pathway for their penetration into the human food chain. In Armenia, the share of plant origin food in overall diet is significantly high, so estimation of dietary intakes of toxic trace elements via consumption of selected fruits and vegetables are of great importance for observing the underlying health risks. Therefore, the present study was aimed to assess dietary exposure of potentially toxic trace elements through the intake of locally grown fruits and vegetables in Akhtala community (Armenia), where not only mining industry is developed, but also cultivation of fruits and vegetables. Moreover, this investigation represents one of the very first attempts to estimate human dietary exposure of potentially toxic trace elements in the study area. Samples of some commonly grown fruits and vegetables (fig, cornel, raspberry, grape, apple, plum, maize, bean, potato, cucumber, onion, greens) were randomly collected from several home gardens located near mining areas in Akhtala community. The concentration of Cu, Mo, Ni, Cr, Pb, Zn, Hg, As and Cd in samples were determined by using an atomic absorption spectrophotometer (AAS). Precision and accuracy of analyses were guaranteed by repeated analysis of samples against NIST Standard Reference Materials. For a diet study, individual-based approach was used, so the consumption of selected fruits and vegetables was investigated through food frequency questionnaire (FFQ). 
Combining concentration data with consumption data, the estimated daily intakes (EDI) and cumulative daily intakes were assessed and compared with health-based guidance values (HBGVs). Based on the determined concentrations of the studied trace elements in fruits and vegetables, some trace elements (Cu, Ni, Pb, Zn) exceeded, in the majority of samples, the maximum allowable limits set by international organizations. Meanwhile, others (Cr, Hg, As, Cd, Mo) either did not exceed these limits or do not yet have established allowable limits. The results indicated that only for Cu did the EDI values exceed the dietary reference intake (0.01 mg/kg bw/day) for some of the investigated fruits and vegetables, in decreasing order: potato > grape > bean > raspberry > fig > greens. In contrast, for combined consumption of the selected fruits and vegetables, the estimated cumulative daily intakes exceeded reference doses in the following sequence: Zn > Cu > Ni > Mo > Pb. It may be concluded that habitual and combined consumption of the above-mentioned fruits and vegetables can pose a health risk to the local population. Hence, further detailed studies are needed for an overall assessment of potential health implications, taking into consideration the adverse health effects posed by more than one toxic trace element.
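The intake estimate in this abstract can be illustrated with a short sketch. It assumes the usual form EDI = concentration x daily consumption / body weight, with the cumulative intake taken as the sum of EDIs over the foods eaten. The concentrations, consumption amounts, and 70 kg body weight are hypothetical; only the 0.01 mg/kg bw/day reference value for Cu comes from the abstract.

```python
def estimated_daily_intake(concentration_mg_per_kg, consumption_kg_per_day, body_weight_kg):
    """EDI in mg per kg body weight per day for one element in one food."""
    return concentration_mg_per_kg * consumption_kg_per_day / body_weight_kg


def cumulative_daily_intake(foods, body_weight_kg):
    """Sum the EDIs of one element over all foods.

    `foods` maps a food name to (concentration in mg/kg, consumption in kg/day).
    """
    return sum(estimated_daily_intake(conc, amount, body_weight_kg)
               for conc, amount in foods.values())


# Hypothetical Cu concentrations and daily consumption for a 70 kg adult.
cu_in_foods = {"potato": (2.0, 0.35), "grape": (3.0, 0.07)}
body_weight = 70.0
cu_reference = 0.01  # mg/kg bw/day, the Cu reference value quoted in the abstract

edi_potato = estimated_daily_intake(*cu_in_foods["potato"], body_weight)
cumulative_cu = cumulative_daily_intake(cu_in_foods, body_weight)
exceeds_reference = cumulative_cu > cu_reference
```

In this made-up case the potato alone sits at the reference value and the combined intake goes above it, which shows how a cumulative assessment can flag a risk that single-food estimates miss.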

Keywords: daily intake, dietary exposure, fruits, trace elements, vegetables

Procedia PDF Downloads 301
125 A Study on Aquatic Bycatch Mortality Estimation Due to Prawn Seed Collection and Alteration of Collection Method through Sustainable Practices in Selected Areas of Sundarban Biosphere Reserve (SBR), India

Authors: Samrat Paul, Satyajit Pahari, Krishnendu Basak, Amitava Roy

Abstract:

Fishing is one of the pivotal livelihood activities, especially in developing countries. It has been considered an important occupation for human society since the era of human settlement began. In simple terms, non-target catches of any species during fishing can be considered ‘bycatch,’ and fishing bycatch is neither a new fishery management issue nor a new problem. Sundarban is one of the world’s largest mangrove lands, extending up to 10,200 sq. km in India and Bangladesh. This mangrove biome resource is used commercially by the local inhabitants, especially by forest fringe villagers (FFVs), to sustain their livelihood. In Sundarban, over-fishing, especially post-larvae collection of wild Penaeus monodon, is one of the major concerns: during the collection of P. monodon, different aquatic species are destroyed as a result of bycatch mortality, which changes productivity and may negatively impact the entire biodiversity of the ecosystem. Wild prawn seed collection gear such as small-mesh nets poses a serious threat to aquatic stocks, as the catch is not limited to prawn seed larvae. As prawn seed collection is inexpensive, requires little monetary investment, and is lucrative, people readily take it up as their source of income. The intervention of the Wildlife Trust of India (WTI) in selected forest fringe villages of Sundarban Tiger Reserve (STR) aimed to estimate and reduce the mortality of aquatic bycatch by involving local communities in a newly developed release method, and to reduce their time engagement in prawn seed collection (PSC) by involving them in Alternate Income Generation (AIG). The study for taxonomic identification of the bycatch was conducted during the period of March to October 2019. Collected samples were preserved in 70% ethyl alcohol for identification, and all the preserved bycatch samples were identified morphologically by experts of the Zoological Survey of India (ZSI), Kolkata. 
Around 74 different aquatic species were collected: 11 species of molluscs, 41 species of fish, of which 31 were identified, and 22 species of crustaceans, of which 18 were identified. Around 13 species belonging to different orders and families could not be identified morphologically as they were collected at the juvenile stage. The study reveals that for each single prawn seed collected, eight individuals of associated fauna are lost. Zero bycatch mortality is not practical; rather, collectors should focus on bycatch reduction by avoiding capture, allowing escape, and reducing mortality, and should change their fishing method by increasing net mesh size, which would avoid non-target captures. However, as the prawn seeds are small (generally 1-1.5 inches in length), increasing the mesh size would leave collectors with little or no profit. In this case, returning bycatch is considered one of the best ways to reduce bycatch mortality and is a more sustainable practice.

Keywords: bycatch mortality, biodiversity, mangrove biome resource, sustainable practice, Alternate Income Generation (AIG)

Procedia PDF Downloads 152
124 Using Machine Learning to Extract Patient Data from Non-standardized Sports Medicine Physician Notes

Authors: Thomas Q. Pan, Anika Basu, Chamith S. Rajapakse

Abstract:

Machine learning requires data that is categorized into features that models train on. This topic is important to the field of sports medicine due to the many tools it provides to physicians, such as diagnosis support and risk assessment. The notes that healthcare professionals take are usually unclean and not suitable for model training. The objective of this study was to develop and evaluate an advanced approach for extracting key features from sports medicine data without the need for extensive model training or data labeling. An LLM (Large Language Model) was given a narrative (physician’s notes) and prompted to extract four features (details about the patient). The narrative was found in a datasheet that contained six columns: Case Number, Validation Age, Validation Gender, Validation Diagnosis, Validation Body Part, and Narrative. The validation columns represent the accurate responses that the LLM attempts to output. With the given narrative, the LLM would output its response and extract the age, gender, diagnosis, and injured body part, with each category taking up one line. The output would then be cleaned, matched, and added to new columns containing the extracted responses. Five ways of checking the accuracy were used: unclear count, substring comparison, LLM comparison, LLM re-check, and hand-evaluation. The unclear count essentially represented the extractions the LLM missed. This can also be understood as the recall score ([total - false negatives] over total). The rest of these correspond to the precision score ([total - false positives] over total). Substring comparison evaluated the likeness of the validation (X) and extracted (Y) columns by checking whether X’s results were a substring of Y's findings and vice versa. LLM comparison directly asked an LLM whether X’s and Y’s results were similar. LLM re-check prompted the LLM to see whether the extracted results could be found in the narrative. 
Lastly, a random selection of 1,000 narratives was hand-evaluated to give an estimate of how well the LLM-based feature extraction model performed. With a selection of 10,000 narratives, the LLM-based approach had a recall score of roughly 98%. However, the precision scores of the substring comparison and LLM comparison models were around 72% and 76%, respectively. These low figures are due to minute differences between answers. For example, the ‘chest’ is a part of the ‘upper trunk’; however, these models cannot detect that. On the other hand, the LLM re-check and the subset of hand-tested narratives showed precision scores of 96% and 95%, respectively. If this subset is used to extrapolate the possible outcome for the whole 10,000 narratives, the LLM-based approach would be strong in both precision and recall. These results indicate that an LLM-based feature extraction model could be a useful way for medical data in sports to be collected and analyzed by machine learning models. Wide use of this method could potentially increase the availability of data, thus improving machine learning algorithms and supporting doctors with more enhanced tools.
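The unclear count and the substring comparison described in this abstract can be sketched as follows. This is a minimal illustration, not the study's code: the `unclear` marker, the function names, and the example rows are hypothetical, and computing precision over answered rows only is an assumption about how the two scores were combined.

```python
UNCLEAR = "unclear"  # assumed marker for a feature the LLM could not extract


def substring_match(validation, extracted):
    """True if either normalized answer is contained in the other."""
    x, y = validation.strip().lower(), extracted.strip().lower()
    if not x or not y:
        return False
    return x in y or y in x


def evaluate(rows):
    """Return (recall, precision) for a list of (validation, extracted) pairs.

    Recall counts rows where something was extracted; precision counts the
    answered rows that pass the substring comparison.
    """
    total = len(rows)
    if total == 0:
        raise ValueError("no rows to evaluate")
    answered = [(x, y) for x, y in rows if y.strip().lower() != UNCLEAR]
    recall = len(answered) / total
    matched = sum(1 for x, y in answered if substring_match(x, y))
    precision = matched / len(answered) if answered else 0.0
    return recall, precision


# Hypothetical validation and extracted values for the body part and age features.
rows = [
    ("knee", "left knee"),
    ("Ankle sprain", "sprain"),
    ("chest", "upper trunk"),
    ("25", "unclear"),
]
recall, precision = evaluate(rows)
```

The third row reproduces the failure mode the abstract describes: 'chest' and 'upper trunk' refer to the same region but share no substring, so a string-based check scores the extraction as wrong.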

Keywords: AI, LLM, ML, sports

Procedia PDF Downloads 6
123 Design of a Small and Medium Enterprise Growth Prediction Model Based on Web Mining

Authors: Yiea Funk Te, Daniel Mueller, Irena Pletikosa Cvijikj

Abstract:

Small and medium enterprises (SMEs) play an important role in the economy of many countries. When the overall world economy is considered, SMEs represent 95% of all businesses in the world, accounting for 66% of the total employment. Existing studies show that the current business environment is characterized as highly turbulent and strongly influenced by modern information and communication technologies, thus forcing SMEs to experience more severe challenges in maintaining their existence and expanding their business. To support SMEs at improving their competitiveness, researchers recently turned their focus on applying data mining techniques to build risk and growth prediction models. However, data used to assess risk and growth indicators is primarily obtained via questionnaires, which is very laborious and time-consuming, or is provided by financial institutes, thus highly sensitive to privacy issues. Recently, web mining (WM) has emerged as a new approach towards obtaining valuable insights in the business world. WM enables automatic and large scale collection and analysis of potentially valuable data from various online platforms, including companies’ websites. While WM methods have been frequently studied to anticipate growth of sales volume for e-commerce platforms, their application for assessment of SME risk and growth indicators is still scarce. Considering that a vast proportion of SMEs own a website, WM bears a great potential in revealing valuable information hidden in SME websites, which can further be used to understand SME risk and growth indicators, as well as to enhance current SME risk and growth prediction models. This study aims at developing an automated system to collect business-relevant data from the Web and predict future growth trends of SMEs by means of WM and data mining techniques. The envisioned system should serve as an 'early recognition system' for future growth opportunities. 
In an initial step, we examine how structured and semi-structured Web data in governmental or SME websites can be used to explain the success of SMEs. WM methods are applied to extract Web data in a form of additional input features for the growth prediction model. The data on SMEs provided by a large Swiss insurance company is used as ground truth data (i.e. growth-labeled data) to train the growth prediction model. Different machine learning classification algorithms such as the Support Vector Machine, Random Forest and Artificial Neural Network are applied and compared, with the goal to optimize the prediction performance. The results are compared to those from previous studies, in order to assess the contribution of growth indicators retrieved from the Web for increasing the predictive power of the model.

Keywords: data mining, SME growth, success factors, web mining

Procedia PDF Downloads 267