Search results for: content classification
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 7962

7092 The Power of Earned Media: Exploring the Key Success of Love Destiny, Thai Smash Hit Television Drama

Authors: Wilaiwan Jongwilaikasaem, Phatteera Sarakornborrirak

Abstract:

While Thai television producers feel anxious about digital disruption, Love Destiny, a Thai television period drama, became a smash hit in Thailand in 2018. Audiences throughout the country not only watched the drama both offline and online but also spread its content on social media and followed cultural trends set by the protagonist. Thus, the main purpose of this article is to examine the secret behind the success of Love Destiny. Data were collected through content analysis and in-depth interviews. The results show that the key to the drama's success is an earned-media phenomenon arising from audience and marketer engagement. Because Love Destiny offers full-flavored content with a plot that challenges tradition, delicate production, and a positive, tangible presentation of Thainess, audiences and marketers enthusiastically built up the Love Destiny trend on social media and returned home to watch television when the drama was on the air.

Keywords: Thai drama, earned media, Love Destiny, television

Procedia PDF Downloads 168
7091 Graph Neural Network-Based Classification for Disease Prediction in Health Care Heterogeneous Data Structures of Electronic Health Record

Authors: Raghavi C. Janaswamy

Abstract:

In the healthcare sector, heterogeneous data elements such as patients, diagnoses, symptoms, conditions, observation text from physician notes, and prescriptions form the essentials of the Electronic Health Record (EHR). In most systems, the data, in the form of clear text and images, are stored or processed in a relational format. However, the intrinsic structural restrictions and complex joins of relational databases limit their widespread utility. In this regard, the design and development of realistic mappings and deep connections as real-time objects offer unparalleled advantages. Herein, a graph neural network-based classification of EHR data has been developed. Patient conditions have been predicted as a node classification task using graph-based open-source EHR data, the Synthea database, stored in TigerGraph. The Synthea dataset is leveraged because it closely represents real-time data and is voluminous. The graph model is built from the heterogeneous EHR data using Python modules: pyTigerGraph to get nodes and edges from the TigerGraph database, PyTorch to tensorize the nodes and edges, and PyTorch Geometric (PyG) to train the Graph Neural Network (GNN), adopt self-supervised learning techniques with autoencoders to generate node embeddings, and eventually perform node classification using those embeddings. The model predicts patient conditions ranging from common to rare. The outcome is expected to open up opportunities for data querying toward better predictions and accuracy.
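The abstract's pipeline relies on pyTigerGraph, PyTorch and PyG; as a dependency-light illustration, the core graph-convolution step those libraries implement can be sketched in NumPy. The graph, features, and weights below are synthetic placeholders, not Synthea data:

```python
import numpy as np

# One graph-convolution layer (Kipf & Welling style):
#   H' = ReLU( D^{-1/2} (A + I) D^{-1/2} H W )
# Nodes stand in for EHR entities (patients, conditions, ...).

rng = np.random.default_rng(0)

n_nodes, n_feats, n_classes = 6, 4, 2
A = np.zeros((n_nodes, n_nodes))
edges = [(0, 1), (0, 2), (1, 3), (2, 4), (4, 5)]  # hypothetical patient-condition links
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

A_hat = A + np.eye(n_nodes)                  # add self-loops
d = A_hat.sum(axis=1)
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt     # symmetric normalization

H = rng.normal(size=(n_nodes, n_feats))      # node feature matrix
W = rng.normal(size=(n_feats, n_classes))    # learnable weights (random here)

H_next = np.maximum(A_norm @ H @ W, 0.0)     # propagate neighbors + ReLU
print(H_next.shape)                          # one embedding row per node
```

In the actual stack, PyG's layers perform this propagation on tensors built from the TigerGraph export, and the rows of `H_next` would feed the node classifier.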

Keywords: electronic health record, graph neural network, heterogeneous data, prediction

Procedia PDF Downloads 81
7090 Towards Real-Time Classification of Finger Movement Direction Using Encephalography Independent Components

Authors: Mohamed Mounir Tellache, Hiroyuki Kambara, Yasuharu Koike, Makoto Miyakoshi, Natsue Yoshimura

Abstract:

This study explores the practicality of using electroencephalographic (EEG) independent components to predict eight-direction finger movements in pseudo-real-time. Six healthy participants with individual-head MRI images performed finger movements in eight directions with two different arm configurations. The analysis was performed in two stages. The first stage consisted of using independent component analysis (ICA) to separate the signals representing brain activity from non-brain signals and to obtain the unmixing matrix. The resulting independent components (ICs) were inspected, and those reflecting brain activity were selected. Finally, the time series of the selected ICs were used to predict the eight finger-movement directions using Sparse Logistic Regression (SLR). The second stage consisted of using the previously obtained unmixing matrix, the selected ICs, and the model obtained by applying SLR to classify a different EEG dataset. This method was applied in two settings, namely the single-participant level and the group level. At the single-participant level, the EEG datasets used in the first and second stages originated from the same participant. At the group level, the EEG datasets used in the first stage were constructed by temporally concatenating each combination without repetition of the EEG datasets of five of the six participants, whereas the EEG dataset used in the second stage originated from the remaining participant. The average test classification results across datasets (mean ± S.D.) were 38.62 ± 8.36% for the single-participant level, significantly higher than the chance level (12.50 ± 0.01%), and 27.26 ± 4.39% for the group level, also significantly higher than the chance level (12.49 ± 0.01%). The classification accuracy within [-45°, 45°] of the true direction was 70.03 ± 8.14% for the single-participant level and 62.63 ± 6.07% for the group level, which may be promising for some real-life applications. Clustering and contribution analyses further revealed the brain regions involved in finger movement and the temporal aspect of their contribution to the classification. These results show the possibility of using the ICA-based method in combination with other methods to build a real-time system to control prostheses.
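A minimal sketch of the two-stage idea, ICA unmixing followed by a sparse (L1-penalized) logistic-regression classifier on the component time series, using scikit-learn on synthetic signals. The channel count, the number of sources, and the two-class labels are stand-ins for the real eight-direction EEG setup:

```python
import numpy as np
from sklearn.decomposition import FastICA
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n_samples, n_channels = 400, 8
S = rng.laplace(size=(n_samples, 3))       # latent non-Gaussian sources
X = S @ rng.normal(size=(3, n_channels))   # mixed "scalp" signals

# Stage 1: learn the unmixing matrix and keep the component time series.
ica = FastICA(n_components=3, random_state=1)
ics = ica.fit_transform(X)

# Toy labels driven by the first component, standing in for movement direction.
y = (ics[:, 0] > 0).astype(int)

# Stage 2: sparse logistic regression on the selected ICs.
clf = LogisticRegression(penalty="l1", solver="liblinear")
clf.fit(ics, y)
acc = clf.score(ics, y)
print(round(acc, 2))
```

In the study's second stage, the stored unmixing matrix and trained classifier would be applied to a different dataset (`ica.transform(X_new)` then `clf.predict(...)`) rather than refit.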

Keywords: brain-computer interface, electroencephalography, finger motion decoding, independent component analysis, pseudo real-time motion decoding

Procedia PDF Downloads 132
7089 Content Monetization as a Mark of Media Economy Quality

Authors: Bela Lebedeva

Abstract:

The characteristics of the Web as a channel of information dissemination (accessibility and openness, interactivity and multimedia news) spread quickly and widely across audiences, positively affecting the perception of content but blurring the understanding of journalistic work. As a result, audiences and advertisers continue to migrate to the Internet. Moreover, online targeting makes it possible to monetize not only the audience (as is customary for traditional media) but also the content and traffic, and to do so more accurately. As users identify themselves with the qualitative characteristics of the new market, its actors take shape. A conflict of interests lies at the base of the economy of their relations; the problem of a traffic tax is one example. Meanwhile, content monetization also actualizes the fiscal interest of the state. The balance of supply and demand is often violated due to political risks, particularly under state capitalism, populism, and authoritarian methods of governing social institutions such as the media. A unique example of access to journalistic material limited by content monetization is the television channel Dozhd' (Rain) in the Russian web space. Its liberal-minded audience has better possibilities for discussion. However, the channel could have been much more successful under conditions of unlimited free speech. Avoiding state pressure and censorship, its management decided to preserve at least its online performance by monetizing all of the content for the core audience. The study methodology was primarily based on analysis of the journalistic content and on qualitative and quantitative analysis of the audience. Reconstructing the main events and relationships of market actors over the last six years, the researcher reached several conclusions. First, under content monetization, the capitalization of content quality will always strive toward the qualitative characteristics of the user, thereby identifying him. Vice versa, the user's demand generates high-quality journalism. The second conclusion follows from the first. The growth of technology, information noise, new political challenges, economic volatility, and a changing cultural paradigm all shape the content-payment model for the individual user. This model defines the user as a beneficiary of specific knowledge and indicates a constant balance of supply and demand, other conditions being equal. As a result, a new economic quality of information is created. This feature is an indicator of the market as a self-regulated system. Monetized quality information is less popular than that of public broadcasting services, but its audience is able to make decisions. These very users sustain the niche sectors that have greater potential for technological development, including ways of monetizing content. The third point of the study allows it to be developed in the discourse of media-space liberalization. This cultural phenomenon may open opportunities for developing the architecture of social and economic relations both locally and regionally.

Keywords: content monetization, state capitalism, media liberalization, media economy, information quality

Procedia PDF Downloads 234
7088 An Application to Predict the Best Study Path for Information Technology Students in Learning Institutes

Authors: L. S. Chathurika

Abstract:

Early prediction of student performance is an important factor in achieving academic excellence. Whatever the study stream in secondary education, students in Sri Lanka lay the foundation for higher studies during the first year of their degree or diploma program. The information technology (IT) field has improved the education domain by allowing students to select specialization areas that showcase their talents and skills. These specializations can be software engineering, network administration, database administration, multimedia design, etc. After completing the first year, students attempt to select the best path by considering numerous factors. The purpose of this experiment is to predict the best study path using machine learning algorithms. Five classification algorithms are selected and tested: decision tree, support vector machine, artificial neural network, Naïve Bayes, and logistic regression. The support vector machine obtained the highest accuracy, 82.4%. The most influential features are then identified in order to recommend the best study path.
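The five-algorithm comparison can be sketched with scikit-learn. The synthetic features below stand in for the real student attributes, so the accuracies here say nothing about the reported 82.4%:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for first-year marks, interests, etc.; 4 classes
# imitate specialization paths (software eng., networking, ...).
X, y = make_classification(n_samples=300, n_features=8, n_informative=5,
                           n_classes=4, random_state=0)

models = {
    "decision tree": DecisionTreeClassifier(random_state=0),
    "SVM": SVC(),
    "ANN": MLPClassifier(max_iter=2000, random_state=0),
    "naive Bayes": GaussianNB(),
    "logistic regression": LogisticRegression(max_iter=1000),
}

# 5-fold cross-validated accuracy per model, then pick the best.
scores = {name: cross_val_score(m, X, y, cv=5).mean() for name, m in models.items()}
best = max(scores, key=scores.get)
print(best, round(scores[best], 3))
```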

Keywords: algorithm, classification, evaluation, features, testing, training

Procedia PDF Downloads 111
7087 Analysis, Evaluation and Optimization of Food Management: Minimization of Food Losses and Food Wastage along the Food Value Chain

Authors: G. Hafner

Abstract:

A method developed at the University of Stuttgart will be presented: ‘Analysis, Evaluation and Optimization of Food Management’. A major focus is the quantification of food losses and food waste, as well as their classification and evaluation with a view to system optimization through waste prevention. For the quantification and accounting of food, food losses and food waste along the food chain, a clear definition of core terms is required at the outset. This includes their methodological classification and demarcation within the sectors of the food value chain. The food chain is divided into agriculture, industry and crafts, trade, and consumption (at home and out of home). To align the core terms, the authors have cooperated with relevant stakeholders in Germany toward holistic and agreed definitions for the whole food chain. This includes modeling of subsystems within the food value chain, definition of terms, differentiation between food losses and food wastage, and methodological approaches. ‘Food losses’ and ‘food wastes’ are assigned to individual sectors of the food chain, including a description of the respective methods. The method for analyzing, evaluating and optimizing food management systems consists of the following parts: Part I: Terms and Definitions; Part II: System Modeling; Part III: Procedure for Data Collection and Accounting; Part IV: Methodological Approaches for Classification and Evaluation of Results; Part V: Evaluation Parameters and Benchmarks; Part VI: Measures for Optimization; Part VII: Monitoring of Success. The method will be demonstrated using the example of an investigation of food losses and food wastage in the Federal State of Bavaria, including an extrapolation of the results to quantify food wastage in Germany.
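The accounting and extrapolation steps (data collection per sector, then scaling Bavarian results to Germany) reduce to simple sector-wise bookkeeping. A sketch with invented per-capita figures; none of the numbers below come from the study:

```python
# Hypothetical per-capita food-waste figures by food-chain sector,
# kg/(person*year); invented for illustration only.
SECTORS_KG_PER_CAPITA = {
    "agriculture": 15.0,
    "industry_and_crafts": 22.0,
    "trade": 8.0,
    "consumption_home": 55.0,
    "consumption_out_of_home": 18.0,
}

POP_BAVARIA = 13.1e6   # approximate population figures
POP_GERMANY = 83.2e6

per_capita_total = sum(SECTORS_KG_PER_CAPITA.values())
bavaria_tonnes = per_capita_total * POP_BAVARIA / 1000.0
germany_tonnes = per_capita_total * POP_GERMANY / 1000.0  # extrapolation step

print(round(per_capita_total, 1), round(germany_tonnes))
```

The real method additionally distinguishes food losses from food wastage and validates per-capita rates sector by sector before extrapolating.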

Keywords: food losses, food waste, resource management, waste management, system analysis, waste minimization, resource efficiency

Procedia PDF Downloads 398
7086 Using ICESat-2 Dynamic Ocean Topography to Estimate Western Arctic Freshwater Content

Authors: Joshua Adan Valdez, Shawn Gallaher

Abstract:

Global climate change has raised atmospheric temperatures, contributing to rising sea levels, decreasing sea ice, and increased freshening of high-latitude oceans. This freshening has increased stratification, inhibiting local mixing and nutrient transport and modifying regional circulations in polar oceans. In recent years, the Western Arctic has seen an increase in freshwater volume at an average rate of 397 ± 116 km³/year across the Beaufort Gyre. The majority of the freshwater volume resides in the Beaufort Gyre surface lens, driven by anticyclonic wind forcing, sea ice melt, and Arctic river runoff, and is typically defined as water fresher than a salinity of 34.8. The near-isothermal nature of Arctic seawater and non-linearities in the equation of state for near-freezing waters result in a salinity-driven pycnocline, as opposed to the temperature-driven density structure seen at lower latitudes. In this study, we investigate the relationship between freshwater content and dynamic ocean topography (DOT). In situ measurements of freshwater content are useful in providing information on the freshening rate of the Beaufort Gyre; however, their collection is costly and time-consuming. Utilizing NASA's ICESat-2 DOT remote sensing capabilities and Air-Expendable CTD (AXCTD) data from the Seasonal Ice Zone Reconnaissance Surveys (SIZRS), a linear regression model between DOT and freshwater content is determined along the 150°W meridian. Freshwater content is calculated by integrating the volume of water between the surface and the depth of the ~34.8 reference salinity. Using this model, we compare interannual variability in freshwater content within the gyre, which could provide a future predictive capability for freshwater volume changes in the Beaufort-Chukchi Sea without in situ methods. Successful employment of the ICESat-2 DOT approximation of freshwater content could demonstrate the value of remote sensing tools in reducing reliance on field deployment platforms to characterize physical ocean properties.
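The regression step amounts to a least-squares fit to paired (DOT, freshwater content) observations. The values below are simulated, not SIZRS or ICESat-2 data, and the slope is an arbitrary placeholder:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic paired observations along a transect:
dot = rng.uniform(-0.3, 0.5, size=40)             # dynamic ocean topography, m
fwc = 12.0 + 18.0 * dot + rng.normal(0, 0.5, 40)  # freshwater content, m (toy relation)

# Fit FWC = intercept + slope * DOT by least squares.
slope, intercept = np.polyfit(dot, fwc, 1)
fwc_hat = intercept + slope * dot
r = np.corrcoef(fwc, fwc_hat)[0, 1]               # goodness of fit

print(round(slope, 1), round(r, 2))
```

With such a model fitted against AXCTD-derived freshwater content, satellite DOT alone could then be mapped to estimated freshwater content between field campaigns.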

Keywords: Cryosphere, remote sensing, Arctic oceanography, climate modeling, Ekman transport

Procedia PDF Downloads 68
7085 Use of High Hydrostatic Pressure as an Alternative Preservation Method in Camel's Milk

Authors: Fahad Aljasass, Hamza Abu-Tarboush, Salah Aleid, Siddig Hamad

Abstract:

The effects of different high hydrostatic pressure treatments on the shelf life of camel's milk were studied. Treatment at 300 to 350 MPa for 5 minutes at 40°C reduced microbial contamination to levels that prolonged the shelf life of refrigerated (3°C) milk up to 28 days. The treatment resulted in a decrease in the proteolytic activity of the milk. The content of proteolytic enzymes in the untreated milk sample was 4.23 µM/ml. This content decreased significantly to 3.61 µM/ml when the sample was treated at 250 MPa. Treatment at 300 MPa decreased the content to 3.90 µM/ml, which was not significantly different from that of the untreated sample. The content of the sample treated at 350 MPa dropped to 2.98 µM/ml, significantly lower than the contents of all other treated and untreated samples. High pressure treatment caused a slight but statistically significant increase in the pH of camel's milk. The pH of the untreated sample was 6.63; it increased significantly to 6.70 in the samples treated at 250 and 350 MPa, but insignificantly in the sample treated at 300 MPa. High pressure treatment resulted in some degree of milk fat oxidation. The thiobarbituric acid (TBA) value of the untreated sample was 0.86 mg malonaldehyde/kg milk. This value remained unchanged in the sample treated at 250 MPa, but increased significantly to 1.25 and 1.33 mg/kg in the samples treated at 300 and 350 MPa, respectively. High pressure treatment caused a small increase in the greenness (a* value) of camel's milk. The value of a* was reduced from -1.17 for the untreated sample to -1.26, -1.21 and -1.30 for the samples treated at 250, 300 and 350 MPa, respectively. Δa* at 250 MPa was -0.09, which decreased in magnitude to -0.04 at 300 MPa and increased again to -0.13 at 350 MPa. The yellowness (b* value) of camel's milk increased significantly as a result of high pressure treatment. The b* value of the untreated sample was 1.40; it increased to 2.73, 2.31 and 2.18 after treatment at 250, 300 and 350 MPa, respectively. Δb* was +1.33 at 250 MPa, decreasing to +0.91 at 300 MPa and further to +0.78 at 350 MPa. Overall, the pressure treatment caused a slight effect on color, a slight decrease in protease activity, and a slight increase in lipid oxidation products.
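The reported Δa* and Δb* values are simply treated-minus-untreated differences; a quick arithmetic check against the figures in the abstract:

```python
# Color coordinates reported in the abstract (CIELAB a* and b*).
a_untreated, b_untreated = -1.17, 1.40
a_treated = {250: -1.26, 300: -1.21, 350: -1.30}   # keys: pressure in MPa
b_treated = {250: 2.73, 300: 2.31, 350: 2.18}

# Delta = treated - untreated, rounded to the reported precision.
delta_a = {p: round(a - a_untreated, 2) for p, a in a_treated.items()}
delta_b = {p: round(b - b_untreated, 2) for p, b in b_treated.items()}

print(delta_a)   # {250: -0.09, 300: -0.04, 350: -0.13}
print(delta_b)   # {250: 1.33, 300: 0.91, 350: 0.78}
```

The computed deltas reproduce the Δa* and Δb* values stated in the text.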

Keywords: high hydrostatic pressure, camel’s milk, mesophilic aerobic bacteria, clotting, protease

Procedia PDF Downloads 255
7084 Issues in Translating Hadith Terminologies into English: A Critical Approach

Authors: Mohammed Riyas Pp

Abstract:

This study investigates major issues in translating Arabic Hadith terminologies into English, focusing on choosing the most appropriate translation for each and reviewing major Hadith works in English. The study is confined to twenty terminologies concerning the classification of Hadith based on authority, strength, number of transmitters, and connections in the isnad. Almost all available translations were collected and analyzed to find the most proper translation based on linguistic and translational values. In the researcher's view, many translations lack a precise understanding of either the Hadith terminologies or the English language, and the variety of methodologies has produced a variety of translations. This study provides a classification of translational and conceptual issues. Translational issues relate to the translatability of these terminologies and their equivalence. Conceptual issues comprise a list of misunderstandings caused by wrong translations of terminologies. The study ends with a suggestion for unification in translating terminologies, based on a convention of Muslim scholars who have a good understanding of both Hadith terminologies and the English language.

Keywords: English language, Hadith terminologies, equivalence in translation, problems in translation

Procedia PDF Downloads 179
7083 Protein Quality of Game Meat Hunted in Latvia

Authors: Vita Strazdina, Aleksandrs Jemeljanovs, Vita Sterna

Abstract:

Not all proteins have the same nutritional value, since protein quality strongly depends on amino acid composition and digestibility. The meat of game animals could be a high-protein source because of its well-balanced essential amino acid composition. Investigations into the biochemical composition of game meat such as wild boar (Sus scrofa scrofa), roe deer (Capreolus capreolus) and beaver (Castor fiber) are scarce. Therefore, the aim of the investigation was to evaluate the protein composition of game meat hunted in Latvia. Biochemical analysis and evaluation of connective tissue and essential amino acids in meat samples were carried out, and amino acid scores were calculated. The results showed that the protein content of all types of meat samples, 20.88-22.05%, did not differ statistically. The connective tissue content, from 1.3% in roe deer to 1.5% in beaver meat, allowed the game animals to be classified as high-quality meat. The sum of essential amino acids in the game meat samples was 7.05–8.26 g 100 g⁻¹. Roe deer meat had the highest protein content and the lowest content of connective tissue among the game meats hunted in Latvia. It was concluded that the amino acid score for the limiting amino acids phenylalanine and tyrosine is high, which shows the high biological value of game meat.
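The amino acid score mentioned above is the ratio of an essential amino acid's content in the test protein to a reference pattern. A sketch with an illustrative (not measured) meat value; the 63 mg/g reference for phenylalanine + tyrosine follows the commonly cited FAO/WHO scoring pattern:

```python
def amino_acid_score(mg_per_g_protein: float, mg_per_g_reference: float) -> float:
    """Score = 100 * (mg of the amino acid per g of test protein)
              / (mg per g protein in the reference pattern)."""
    return 100.0 * mg_per_g_protein / mg_per_g_reference

phe_tyr_in_meat = 80.0    # hypothetical mg per g protein, for illustration
phe_tyr_reference = 63.0  # FAO/WHO reference pattern, mg per g protein

score = amino_acid_score(phe_tyr_in_meat, phe_tyr_reference)
print(round(score, 1))    # a score above 100 means the amino acid is not limiting
```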

Keywords: dietic product, game meat, amino acids, scores

Procedia PDF Downloads 310
7082 The Practice of Teaching Chemistry by the Application of Online Tests

Authors: Nikolina Ribarić

Abstract:

E-learning is most commonly defined as a set of applications and processes, such as Web-based learning, computer-based learning, virtual classrooms, and digital collaboration, that enable access to instructional content through a variety of electronic media. The main goal of an e-learning system is learning, and the way to evaluate the impact of an e-learning system is to examine whether students learn effectively with its help. Testmoz is a program for the online preparation of knowledge-evaluation assignments. The program provides teachers with computer support for designing and evaluating assignments, while students can review and solve the assignments and check the correctness of their solutions. Research into whether providing teaching content through online tests prepared in Testmoz increases motivation was carried out with 8th-grade students of Ljubo Babić Primary School in Jastrebarsko. The students took the tests in their free time, from home, an unlimited number of times. SPSS was used to process the data obtained by the research instruments. The results showed that students preferred practicing teaching content, and achieved better educational results in chemistry, when they had access to online tests for repetition and practice, compared with subject content that was checked after repetition and practice in "the classical way", i.e., solving assignments in a workbook or writing assignments on worksheets.

Keywords: chemistry class, e-learning, motivation, Testmoz

Procedia PDF Downloads 151
7081 Nutrition of Preschool Children in the Aspect of Nutritional Status

Authors: Klaudia Tomala, Elzbieta Grochowska-Niedworok, Katarzyna Brukalo, Marek Kardas, Beata Calyniuk, Renata Polaniak

Abstract:

Background. Nutrition plays an important role in the psychophysical growth of children and affects their health. Providing children with the appropriate supply of macro- and micronutrients requires dietary diversity across every food group. Meals in kindergartens should provide 70-75% of children's daily food requirement. Aim. The aim of this study was to determine the vitamin content in the food rations of children attending kindergarten, in the wider context of nutritional status. Material and Methods. Kindergarten menus from the spring and autumn seasons of 2015 were analyzed; their fat content and vitamin levels were estimated. The vitamin content was evaluated using the diet calculator "Aliant". Statistical analysis was done in MS Office Excel 2007. Results. The vitamin content in the analyzed menus was in many cases too high with reference to dietary reference intakes; only vitamin D intake was insufficient. Vitamin E intake was closest to the dietary reference intake. Conclusion. The results show that vitamin intake is usually too high, and menus should therefore be modified. Nutrition education among kindergarten staff is also needed. The identified errors in meal composition will affect children's nutritional status and body composition.

Keywords: children, nutrition status, vitamins, preschool

Procedia PDF Downloads 151
7080 Investigation of Gaseous Ozone Influence on Quality Parameters of Flakes Made from Biologically Activated Whole Wheat Grains

Authors: Tatjana Rakcejeva, Jelena Zagorska, Elina Zvezdina

Abstract:

The aim of the current research was to investigate the effect of gaseous ozone on quality parameters of flakes made from whole biologically activated wheat grains. The research was carried out on wheat grains of the variety ′Zentos′ harvested in 2012. The grains were washed and wetted, and biological activation was performed in a climatic chamber for up to 24 hours. After biological activation, the grains were compressed; the flakes were then dried in a convective drier to a constant moisture content of 9 ± 1%. For grain treatment, a gaseous ozone concentration of 0.0002% and a treatment time of 6 min were used. In the processed flakes, the content of α- and γ-tocopherol decreased by 23% and 9%, respectively; the content of vitamins B2 and B6 by 11% and 10%; elaidic acid by 46%; oleic acid by 29%; and the amino acids arginine (by 80%), glutamine (by 74%), asparagine and serine (by 68%), valine (by 62%), cysteine (by 54%) and tyrosine (by 47%).

Keywords: gaseous ozone, flakes, biologically activated grains, quality parameters, treatment

Procedia PDF Downloads 231
7079 Monitoring of Pesticide Content in Biscuits Available on the Vojvodina Market, Serbia

Authors: Ivana Loncarevic, Biljana Pajin, Ivana Vasiljevic, Milana Lazovic, Danica Mrkajic, Aleksandar Fises, Strahinja Kovacevic

Abstract:

Biscuits belong to a group of flour-confectionery products that are widely consumed worldwide. The basic raw material for their production is wheat flour or wholemeal flour, a nutritionally highly valuable component. However, this raw material is also a potential source of contamination, since it may contain residues of biochemical compounds originating from plant and soil protection agents. It is therefore necessary to examine the health safety of both the raw materials and the final products. The aim of this research was to examine the content of undesirable pesticide residues (mostly organochlorine, organophosphorus, carbamate, triazine, and pyrethroid pesticides) in 30 different biscuit samples of domestic origin present on the Vojvodina market, using a Thermo ISQ/Trace 1300 gas chromatograph. The results showed that all tested samples contained pesticide levels below the limit of detection (0.01 mg/kg), indicating that this type of confectionery product is not contaminated with pesticides.

Keywords: biscuits, pesticides, contamination, quality

Procedia PDF Downloads 170
7078 Diversity in Finance Literature Revealed through the Lens of Machine Learning: A Topic Modeling Approach on Academic Papers

Authors: Oumaima Lahmar

Abstract:

This paper aims to define a structured topography for finance researchers seeking to navigate the body of knowledge in their extrapolation of finance phenomena. To make sense of the body of knowledge in finance, a probabilistic topic-modeling approach is applied to 6000 abstracts of academic articles published in three top finance journals between 1976 and 2020. This approach combines machine learning techniques and natural language processing to statistically identify the conjunctions between research articles and their shared topics, each described by relevant keywords. The topic-modeling analysis reveals 35 coherent topics that depict the finance literature well and provide a comprehensive structure for the ongoing research themes. Comparing the extracted topics to the Journal of Economic Literature (JEL) classification system, a significant similarity was highlighted between the characterizing keywords. On the other hand, we identify topics that do not match the JEL classification despite being relevant in the finance literature.

Keywords: finance literature, textual analysis, topic modeling, perplexity

Procedia PDF Downloads 159
7077 A Framework for Auditing Multilevel Models Using Explainability Methods

Authors: Debarati Bhaumik, Diptish Dey

Abstract:

Multilevel models, increasingly deployed in industries such as insurance, food production, and entertainment within functions such as marketing and supply chain management, need to be transparent and ethical. Applications usually result in binary classification within groups or hierarchies based on a set of input features. Using open-source datasets, we demonstrate that popular explainability methods, such as SHAP and LIME, consistently underperform in accuracy when interpreting these models. They fail to predict the order of feature importance, the magnitudes, and occasionally even the nature of the feature contribution (negative versus positive contribution to the outcome). Besides accuracy, the computational intractability of SHAP for binomial classification is a cause for concern. For transparent and ethical applications of these hierarchical statistical models, sound audit frameworks need to be developed. In this paper, we propose an audit framework for the technical assessment of multilevel regression models focusing on three aspects: (i) model assumptions and statistical properties, (ii) model transparency using different explainability methods, and (iii) discrimination assessment. To this end, we undertake a quantitative approach and compare intrinsic model methods with SHAP and LIME. The framework comprises a shortlist of KPIs for each of these three aspects, such as PoCE (Percentage of Correct Explanations) and MDG (Mean Discriminatory Gap) per feature. A traffic-light risk assessment method is furthermore coupled to these KPIs. The audit framework will assist regulatory bodies in performing conformity assessments of AI systems that use multilevel binomial classification models at businesses. It will also help businesses deploying multilevel models to be future-proof and aligned with the European Commission's proposed Regulation on Artificial Intelligence.
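One way to picture a PoCE-style KPI: compare an explainer's feature ranking and contribution signs against a model's own coefficients. The paper's exact definition is not given here, so this is only an illustrative sketch with synthetic values:

```python
import numpy as np

# Ground truth: the model's own coefficients (intrinsic view).
true_coefs  = np.array([ 2.0, -1.5,  0.8, -0.2])
# What a hypothetical explainer (e.g. SHAP- or LIME-like) reported.
expl_attrib = np.array([ 1.8, -1.2, -0.3, -0.1])

# Rank features by absolute importance under each view.
true_rank = np.argsort(-np.abs(true_coefs))
expl_rank = np.argsort(-np.abs(expl_attrib))

rank_ok = true_rank == expl_rank                    # order agreement
sign_ok = np.sign(true_coefs) == np.sign(expl_attrib)  # direction agreement

# "Percentage of correct explanations" over features, requiring both.
poce = 100.0 * np.mean(rank_ok & sign_ok)
print(poce)
```

Here the explainer gets the ordering right but flips the sign of one feature, so the KPI penalizes it; in a traffic-light scheme such a score would be mapped to a risk band.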

Keywords: audit, multilevel model, model transparency, model explainability, discrimination, ethics

Procedia PDF Downloads 85
7076 Large Neural Networks Learning From Scratch With Very Few Data and Without Explicit Regularization

Authors: Christoph Linse, Thomas Martinetz

Abstract:

Recent findings have shown that Neural Networks generalize even in over-parameterized regimes with zero training error. This is surprising, since it runs completely against traditional machine learning wisdom. In our empirical study we fortify these findings in the domain of fine-grained image classification. We show that very large Convolutional Neural Networks with millions of weights do learn with only a handful of training samples and without image augmentation, explicit regularization, or pretraining. We train the architectures ResNet018, ResNet101 and VGG19 on subsets of the difficult benchmark datasets Caltech101, CUB_200_2011, FGVCAircraft, Flowers102 and StanfordCars, each with 100 classes or more, perform a comprehensive comparative study, and draw implications for the practical application of CNNs. Finally, we show that VGG19, with 140 million weights, learns to distinguish airplanes from motorbikes with up to 95% accuracy using only 20 training samples per class.
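The few-samples-per-class protocol (e.g. the 20-training-samples-per-class experiment) amounts to stratified subsampling of a labeled pool before training; a small sketch of that subset-building step:

```python
import numpy as np

def subset_per_class(labels: np.ndarray, n_per_class: int, seed: int = 0) -> np.ndarray:
    """Return indices containing exactly n_per_class samples of each class."""
    rng = np.random.default_rng(seed)
    picked = []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        picked.append(rng.choice(idx, size=n_per_class, replace=False))
    return np.concatenate(picked)

# Synthetic pool of 500 "images" per class; the two classes stand in for
# e.g. airplanes vs. motorbikes.
labels = np.repeat([0, 1], 500)
train_idx = subset_per_class(labels, 20)

print(len(train_idx), np.bincount(labels[train_idx]))
```

The resulting index set would select the tiny training split fed to the CNN, with everything else held out for evaluation.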

Keywords: convolutional neural networks, fine-grained image classification, generalization, image recognition, over-parameterized, small data sets

Procedia PDF Downloads 78
7075 Developing an Advanced Algorithm Capable of Classifying News, Articles and Other Textual Documents Using Text Mining Techniques

Authors: R. B. Knudsen, O. T. Rasmussen, R. A. Alphinas

Abstract:

The reason for conducting this research is to develop an algorithm capable of classifying news articles from the automobile industry, according to the competitive actions they entail, with the use of Text Mining (TM) methods. The data must be properly preprocessed by preparing pipelines that best fit each algorithm. The pipelines are tested along with nine different classification algorithms from the realms of regression, support vector machines, and neural networks. Preliminary testing to identify the optimal pipelines and algorithms resulted in the selection of two algorithms, each with its own pipeline: Logistic Regression (LR) and an Artificial Neural Network (ANN). These algorithms are optimized further by testing several parameters of each. The best result is achieved with the ANN. The final model yields an accuracy of 0.79, a precision of 0.80, a recall of 0.78, and an F1 score of 0.76. By removing three classes that created noise, the final algorithm reaches an accuracy of 94%.
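One of the two selected pipelines can be sketched as text features feeding Logistic Regression in scikit-learn. The toy texts and competitive-action labels below are invented, and the paper's actual preprocessing steps may differ:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Invented stand-ins for automobile-industry news, labeled by a
# hypothetical competitive-action class.
texts = [
    "automaker cuts prices on electric sedan", "price cut announced for suv line",
    "new factory opens to expand capacity", "plant expansion adds production lines",
    "brand launches marketing campaign", "new ad campaign targets young drivers",
] * 10
labels = ["pricing", "pricing", "capacity", "capacity", "marketing", "marketing"] * 10

X_tr, X_te, y_tr, y_te = train_test_split(texts, labels, test_size=0.25,
                                          random_state=0, stratify=labels)

# Pipeline: TF-IDF vectorization, then the LR classifier.
pipe = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
pipe.fit(X_tr, y_tr)
pred = pipe.predict(X_te)

print(round(accuracy_score(y_te, pred), 2),
      round(f1_score(y_te, pred, average="macro"), 2))
```

Swapping the classifier for an `MLPClassifier` would give the ANN counterpart tested in the study.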

Keywords: artificial neural network, competitive dynamics, logistic regression, text classification, text mining

Procedia PDF Downloads 116
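The four figures reported above (accuracy, precision, recall, F1) all derive from confusion-matrix counts; a minimal sketch of that relationship, using hypothetical counts rather than the paper's data, is:

```python
# Hedged sketch: how accuracy, precision, recall, and F1 relate, computed
# from a hypothetical one-class confusion matrix (not the paper's results).
def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall, f1) from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Illustrative counts only:
acc, prec, rec, f1 = classification_metrics(tp=80, fp=20, fn=20, tn=80)
```

Note that F1 is the harmonic mean of precision and recall, which is why the paper's F1 (0.76) sits below both its precision (0.80) and recall (0.78) would suggest for an arithmetic mean.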
7074 Enhancing the Interpretation of Group-Level Diagnostic Results from Cognitive Diagnostic Assessment: Application of Quantile Regression and Cluster Analysis

Authors: Wenbo Du, Xiaomei Ma

Abstract:

With the empowerment of Cognitive Diagnostic Assessment (CDA), various domains of language testing and assessment have been investigated to extract more diagnostic information. Notably, most extant empirical CDA-based research emphasizes individual-level diagnosis, with very little concerned with learners’ group-level performance. Even though personalized diagnostic feedback is the unique feature that differentiates CDA from other assessment tools, group-level diagnostic information cannot be overlooked, as it may be more practical in classroom settings. Additionally, the group-level diagnostic information obtained via current CDA often results in a “flat pattern”: mastery or non-mastery of all tested skills accounts for the two highest proportions. In that case, the outcome offers little benefit beyond the original total score. To address these issues, the present study applies cluster analysis for group classification and quantile regression analysis to pinpoint learners’ performance at different proficiency levels (beginner, intermediate, and advanced), thereby enhancing the interpretation of CDA results extracted from a group of EFL learners’ performance on a diagnostic reading test designed by the PELDiaG research team at a key university in China. The results show that the EM method in cluster analysis yields more appropriate classification results than CDA, and that quantile regression analysis paints a more insightful picture of the characteristics of learners with different reading proficiencies. The findings are helpful and practical for instructors refining EFL reading curricula and instructional plans tailored to the group classification and quantile regression results. Meanwhile, these statistical methods could make up for the deficiencies of CDA and push forward the development of language testing and assessment.

Keywords: cognitive diagnostic assessment, diagnostic feedback, EFL reading, quantile regression

Procedia PDF Downloads 141
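Quantile regression, as applied above, rests on minimizing the pinball (quantile) loss rather than squared error; a minimal sketch of that loss, with invented numbers, is:

```python
# Hedged sketch: the pinball (quantile) loss underlying quantile regression.
# Minimizing it over predictions q yields the tau-th conditional quantile.
def pinball_loss(y_true, y_pred, tau):
    """Average quantile loss for quantile level tau in (0, 1)."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-predictions are weighted by tau, over-predictions by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1) * diff
    return total / len(y_true)

# For tau = 0.5 the loss is half the mean absolute error (median regression):
loss = pinball_loss([10, 20, 30], [20, 20, 20], tau=0.5)
```

Fitting at several tau values (e.g. 0.25, 0.5, 0.75) is what lets the method characterize beginner, intermediate, and advanced learners separately.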
7073 Modeling Engagement with Multimodal Multisensor Data: The Continuous Performance Test as an Objective Tool to Track Flow

Authors: Mohammad H. Taheri, David J. Brown, Nasser Sherkat

Abstract:

Engagement is one of the most important factors in determining successful outcomes and deep learning in students. Existing approaches to detecting student engagement involve periodic human observations that are subject to inter-rater reliability issues. Our solution uses real-time multimodal multisensor data, labeled by objective performance outcomes, to infer the engagement of students. The study involves four students with a combined diagnosis of cerebral palsy and a learning disability who took part in a 3-month trial over 59 sessions. Multimodal multisensor data were collected while they participated in a continuous performance test. Eye gaze, electroencephalogram, body pose, and interaction data were used to create a model of student engagement through objective labeling from the continuous performance test outcomes. To achieve this, a new type of continuous performance test is introduced, the Seek-X type. Nine features were extracted, including high-level handpicked compound features. Using leave-one-out cross-validation, a series of different machine learning approaches were evaluated. Overall, the random forest classification approach achieved the best results: 93.3% classification accuracy for engagement and 42.9% accuracy for disengagement. We compared these results to outcomes from different models: AdaBoost, decision tree, k-nearest neighbor, naïve Bayes, neural network, and support vector machine. We showed that using a multisensor approach achieved higher accuracy than using features from any reduced set of sensors, and that using high-level handpicked features can improve classification accuracy in every sensor mode. Our approach is robust to both sensor fallout and occlusions. The single most important sensor feature for the classification of engagement and distraction was shown to be eye gaze.
It has been shown that we can accurately predict the level of engagement of students with learning disabilities in real time, in an approach that is not subject to inter-rater reliability issues, does not depend on human observation, and is not reliant on a single mode of sensor input. This will help teachers design interventions for a heterogeneous group of students whose individual needs they cannot possibly attend to in full. Our approach can be used to identify those with the greatest learning challenges so that all students are supported to reach their full potential.

Keywords: affective computing in education, affect detection, continuous performance test, engagement, flow, HCI, interaction, learning disabilities, machine learning, multimodal, multisensor, physiological sensors, student engagement

Procedia PDF Downloads 82
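Leave-one-out cross-validation, the evaluation scheme used above, holds out each sample in turn and trains on the rest; a minimal sketch with a simple 1-nearest-neighbour rule on made-up one-dimensional "feature" values (not the study's sensor data) is:

```python
# Hedged sketch of leave-one-out cross-validation (LOOCV), illustrated with
# a 1-nearest-neighbour classifier on invented 1-D data.
def one_nn(train, query):
    """Predict the label of the training point closest to the query."""
    return min(train, key=lambda xy: abs(xy[0] - query))[1]

def loocv_accuracy(data):
    """Hold out each sample in turn, 'train' on the rest, and score."""
    correct = 0
    for i, (x, y) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        correct += one_nn(rest, x) == y
    return correct / len(data)

# Hypothetical (feature, label) pairs:
data = [(0.1, "engaged"), (0.2, "engaged"),
        (0.9, "disengaged"), (1.0, "disengaged")]
acc = loocv_accuracy(data)
```

LOOCV is a natural choice for a study of this size, since with only four participants every sample is too valuable to reserve for a fixed test split.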
7072 Comparison of Phenolic and Urushiol Contents of Different Parts of Rhus verniciflua and Their Antimicrobial Activity

Authors: Jae Young Jang, Jong Hoon Ahn, Jae-Woong Lim, So Young Kang, Mi Kyeong Lee

Abstract:

Rhus verniciflua is commonly known as the lacquer tree in Korea. The stem bark of R. verniciflua has been used as an immunostimulator in traditional medicine. It contains phenolic compounds and is known for diverse biological activities such as antioxidant and antimicrobial activity. However, it also causes allergic dermatitis due to urushiol derivatives. For the development of active natural resources with less toxicity, the contents of phenolic compounds and urushiols in different parts of R. verniciflua (stem bark, lignum, and leaves) were quantitated by colorimetric assay and HPLC analysis. The urushiol content was highest in stem bark, followed by leaves; the lignum contained only trace amounts of urushiols. The phenolic content, however, was most abundant in lignum, followed by leaves and stem bark. These results clearly showed that the contents of urushiols and phenolics differ depending on the part of R. verniciflua. Antimicrobial activity of different parts of R. verniciflua against the fish pathogenic bacterium Edwardsiella tarda was also investigated. Lignum of R. verniciflua was the most effective against E. tarda, and phenolic constituents are suggested to be the active constituents. Taken together, phenolic compounds are responsible for the antimicrobial activity of R. verniciflua. The lignum contains a high content of phenolic compounds with few urushiols, suggesting efficient antimicrobial activity with less toxicity. Therefore, lignum of R. verniciflua is suggested as a good source of antimicrobial activity against fish bacterial diseases.

Keywords: different parts, phenolic compounds, Rhus verniciflua, urushiols

Procedia PDF Downloads 310
7071 Improved Rare Species Identification Using Focal Loss Based Deep Learning Models

Authors: Chad Goldsworthy, B. Rajeswari Matam

Abstract:

The use of deep learning for species identification in camera trap images has revolutionised our ability to study, conserve and monitor species in a highly efficient and unobtrusive manner, with state-of-the-art models achieving accuracies surpassing that of manual human classification. The high class imbalance of camera trap datasets, however, results in poor accuracy for minority (rare or endangered) species due to their relative insignificance to the overall model accuracy. This paper investigates the use of focal loss, in comparison to the traditional cross-entropy loss function, to improve the identification of minority species in the “255 Bird Species” dataset from Kaggle. The results show that, although focal loss slightly decreased the accuracy on the majority species, it increased the F1-score by 0.06 and improved the identification of the bottom two, five, and ten (minority) species by 37.5%, 15.7%, and 10.8%, respectively, as well as improving overall accuracy by 2.96%.

Keywords: convolutional neural networks, data imbalance, deep learning, focal loss, species classification, wildlife conservation

Procedia PDF Downloads 176
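The focal loss compared above down-weights the contribution of well-classified (mostly majority-class) samples so that rare-class errors dominate training; a minimal single-sample sketch of the standard formulation is:

```python
# Hedged sketch of focal loss versus cross-entropy for one sample.
# FL(p) = -(1 - p)^gamma * log(p) down-weights well-classified examples
# (true-class probability p near 1), so rare-class errors dominate.
import math

def cross_entropy(p):
    """Standard cross-entropy for true-class probability p."""
    return -math.log(p)

def focal_loss(p, gamma=2.0):
    """Focal loss; gamma = 0 recovers plain cross-entropy."""
    return -((1.0 - p) ** gamma) * math.log(p)

# A confident correct prediction is strongly down-weighted...
easy = focal_loss(0.9) / cross_entropy(0.9)   # factor (1 - 0.9)^2 = 0.01
# ...while a badly misclassified (e.g. rare-class) sample barely is.
hard = focal_loss(0.1) / cross_entropy(0.1)   # factor (1 - 0.1)^2 = 0.81
```

The focusing parameter gamma (2.0 here, the value commonly used in the literature) controls how aggressively easy examples are suppressed; the paper does not state its choice, so treat the default as an assumption.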
7070 Mitochondrial DNA Copy Number in Egyptian Patients with Hepatitis C Virus Related Hepatocellular Carcinoma

Authors: Doaa Hashad, Amany Elyamany, Perihan Salem

Abstract:

Introduction: Hepatitis C virus (HCV) infection constitutes a serious dilemma affecting the health of millions of Egyptians. HCV-related hepatocellular carcinoma (HCV-HCC) is a crucial consequence of HCV and represents the third leading cause of cancer-related deaths worldwide. Aim of the study: To assess the use of mitochondrial DNA (mtDNA) content as a non-invasive molecular biomarker in HCV-HCC. Methods: A total of 135 participants were enrolled and assigned equally to one of three groups: a group with HCV-related cirrhosis (HCV-cirrhosis), a group with HCV-HCC, and a control group of age- and sex-matched healthy volunteers with no evidence of liver disease. mtDNA content was determined using a quantitative real-time PCR technique. Results: mtDNA content was lowest in HCV-HCC cases. No statistically significant difference was observed between the HCV-cirrhosis group and the control group regarding mtDNA level. HCC patients with multicentric hepatic lesions had significantly lower mtDNA content. Using receiver operating characteristic curve analysis, a cutoff of 34 was assigned for mtDNA content to distinguish HCV-HCC patients from HCV-cirrhosis patients not yet complicated by malignancy. Lower mtDNA was associated with greater HCC risk whether healthy controls, HCV-cirrhosis patients, or both groups combined were used as the reference group. Conclusions: mtDNA content might constitute a non-invasive molecular biomarker that reflects tumor burden in HCV-HCC cases and could be used as a predictor of HCC risk in patients with HCV-cirrhosis. In addition, the non-significant difference in mtDNA level between HCV-cirrhosis patients and healthy controls could eliminate the grey zone created by the use of AFP in some cirrhotic patients.

Keywords: DNA copy number, HCC, HCV, mitochondrial

Procedia PDF Downloads 318
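A diagnostic cutoff like the one assigned above turns a continuous biomarker into a binary test that is scored by sensitivity and specificity; a minimal sketch, with invented mtDNA-content values (not the study's data), is:

```python
# Hedged sketch: applying a diagnostic cutoff (such as the mtDNA cutoff of
# 34 described above) and scoring sensitivity and specificity.
# All sample values below are invented for illustration.
def classify_by_cutoff(values, cutoff):
    """Flag values below the cutoff as positive (lower mtDNA -> higher HCC risk)."""
    return [v < cutoff for v in values]

def sensitivity_specificity(predicted, actual):
    tp = sum(p and a for p, a in zip(predicted, actual))
    tn = sum(not p and not a for p, a in zip(predicted, actual))
    sens = tp / sum(actual)                    # true-positive rate
    spec = tn / (len(actual) - sum(actual))    # true-negative rate
    return sens, spec

mtdna = [20, 30, 33, 40, 50, 60]                 # hypothetical mtDNA contents
disease = [True, True, True, False, False, False]
sens, spec = sensitivity_specificity(classify_by_cutoff(mtdna, 34), disease)
```

ROC analysis, as used in the study, amounts to sweeping this cutoff over its range and choosing the value that best trades sensitivity against specificity.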
7069 Spatial Data Mining by Decision Trees

Authors: Sihem Oujdi, Hafida Belbachir

Abstract:

Existing data mining methods cannot be applied directly to spatial data, because spatial data require the consideration of spatial specificities such as spatial relationships. This paper focuses on classification with decision trees, one of the data mining techniques. We propose an extension of the C4.5 algorithm for spatial data based on two different approaches: join materialization and querying the different tables on the fly. Similar work has been done on these two main approaches: the first, join materialization, favors processing time at the expense of memory space, whereas the second, querying the different tables on the fly, favors memory space at the expense of processing time. The modified C4.5 algorithm requires three input tables: a target table, a neighbor table, and a spatial join index that contains the possible spatial relationships between the objects in the target table and those in the neighbor table. The proposed algorithms are applied to a spatial dataset in the accidentology domain. A comparative study of our approach with other work on classification by spatial decision trees is also detailed.

Keywords: C4.5 algorithm, decision trees, S-CART, spatial data mining

Procedia PDF Downloads 606
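At the core of C4.5, which the paper extends, is the gain-ratio heuristic for choosing the attribute to split on; a minimal sketch on a tiny made-up split (not the paper's spatial extension) is:

```python
# Hedged sketch of C4.5's attribute-selection heuristic:
# gain ratio = information gain / split information.
import math

def entropy(labels):
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def gain_ratio(parent, splits):
    """splits: list of label lists produced by one candidate attribute."""
    n = len(parent)
    remainder = sum(len(s) / n * entropy(s) for s in splits)
    gain = entropy(parent) - remainder            # information gain
    split_info = -sum(len(s) / n * math.log2(len(s) / n) for s in splits)
    return gain / split_info

# Invented accidentology-flavoured labels; a perfect binary split:
parent = ["accident", "accident", "none", "none"]
gr = gain_ratio(parent, [["accident", "accident"], ["none", "none"]])
```

The spatial extension proposed above changes where candidate attributes come from (the neighbor table, via the spatial join index), not this selection criterion itself.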
7068 Oil Contents, Mineral Compositions, and Their Correlations in Wild and Cultivated Safflower Seeds

Authors: Rahim Ada, Mustafa Harmankaya, Sadiye Ayse Celik

Abstract:

Safflower seed contains about 25-40% solvent extract and 20-33% fiber. It is well known that dietary phospholipids effectively lower serum cholesterol levels. The nutrient composition of safflower seed changes depending on region, soil, and genotype. This research used naturally selected safflower lines (A22, A29, A30, C12, E1, F4, G8, G12, J27) and three commercial varieties (Remzibey, Dincer, Black Sun1). The research was conducted under field conditions for two years (2009 and 2010) in a randomized complete block design with three replications under the ecological conditions of Konya, Turkey. Oil contents, mineral contents, and their correlations were determined. According to the results, oil content ranged from 22.38% to 34.26%, while the minerals fell within the following ranges: 1469.04-2068.07 mg kg-1 for Ca, 7.24-11.71 mg kg-1 for B, 13.29-17.41 mg kg-1 for Cu, 51.00-79.35 mg kg-1 for Fe, 3988-6638.34 mg kg-1 for K, 1418.61-2306.06 mg kg-1 for Mg, 11.37-17.76 mg kg-1 for Mn, 4172.33-7059.58 mg kg-1 for P, and 32.60-59.00 mg kg-1 for Zn. Correlation analysis, carried out separately for the commercial varieties and the wild lines, showed that in the commercial varieties high oil content was negatively correlated with all the investigated minerals except K and Zn.

Keywords: safflower, oil, quality, mineral content

Procedia PDF Downloads 260
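The negative oil-mineral correlations reported above are Pearson correlation coefficients; a minimal sketch of that computation, with invented values rather than the study's data, is:

```python
# Hedged sketch of the Pearson correlation used to relate oil content to
# mineral concentration; the numbers below are invented for illustration.
import math

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

oil = [22, 26, 30, 34]                       # hypothetical oil contents (%)
mineral = [2000.0, 1800.0, 1600.0, 1400.0]   # hypothetical Ca (mg/kg)
r = pearson_r(oil, mineral)                  # negative: more oil, less Ca
```

A value of r near -1, as in this contrived example, corresponds to the strong negative association the study reports for most minerals in the commercial varieties.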
7067 In vitro Antioxidant Activity of Derris scandens Extract

Authors: Nattawit Thiapairat

Abstract:

Multiple diseases have been linked to excessive levels of free radicals, which cause tissue or cell damage as a result of oxidative stress. Many plants are sources of high antioxidant activity. Derris scandens has high phenolic and flavonoid contents, which have demonstrated good biological activities. This study focused on the antioxidant activity of polyphenols extracted from D. scandens. Total flavonoid content was determined, along with two antioxidant assays: the 2,2-diphenyl-1-picrylhydrazyl (DPPH) and 2,2'-azino-bis(3-ethylbenzothiazoline-6-sulfonic acid) (ABTS) radical scavenging capacity assays. The total flavonoid content of D. scandens extract was measured by the aluminum chloride colorimetric method and expressed as quercetin equivalents (QE)/g. In the DPPH assay, vitamin C was used as the positive control, whereas Trolox was used as the positive control in the ABTS assay. The half-maximal inhibitory concentration (IC50) values for D. scandens extract were 41.79 ± 0.783 μg/mL in the DPPH assay and 29.42 ± 0.890 μg/mL in the ABTS assay. To conclude, D. scandens extract contains a high amount of total phenolic content and exhibits significant antioxidant activity. However, further investigation of antioxidant activity, such as SOD, ROS, and RNS scavenging assays and in vivo experiments, should be performed.

Keywords: ABTS assay, antioxidant activity, Derris scandens, DPPH assays, total flavonoid content

Procedia PDF Downloads 203
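The IC50 values quoted above come from percent-inhibition measurements at several extract concentrations; a minimal sketch of percent inhibition and a linear interpolation for IC50, using invented absorbance readings, is:

```python
# Hedged sketch: percent inhibition in a DPPH-type assay and a linear
# interpolation for IC50. All absorbances/concentrations are invented.
def percent_inhibition(a_control, a_sample):
    """Radical scavenging (%) from control and sample absorbances."""
    return 100.0 * (a_control - a_sample) / a_control

def ic50_linear(concs, inhibitions):
    """Interpolate the concentration giving 50% inhibition."""
    points = list(zip(concs, inhibitions))
    for (c1, i1), (c2, i2) in zip(points, points[1:]):
        if i1 <= 50.0 <= i2:
            return c1 + (50.0 - i1) * (c2 - c1) / (i2 - i1)
    raise ValueError("50% inhibition not bracketed by the data")

# Control absorbance 1.0; sample absorbances fall as concentration rises:
inh = [percent_inhibition(1.0, a) for a in (0.8, 0.6, 0.4, 0.2)]  # ~20..80 %
ic50 = ic50_linear([10, 20, 40, 80], inh)   # hypothetical conc. (ug/mL)
```

In practice IC50 is usually read off a fitted dose-response curve rather than a two-point interpolation; the sketch shows only the underlying definition.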
7066 A Robust System for Foot Arch Type Classification from Static Foot Pressure Distribution Data Using Linear Discriminant Analysis

Authors: R. Periyasamy, Deepak Joshi, Sneh Anand

Abstract:

Foot posture assessment is important for evaluating foot types that cause gait and postural defects in all age groups. Although different methods are used for classification of foot arch type in clinical/research examination, there is no clear approach for selecting the most appropriate measurement system. Therefore, the aim of this study was to develop a system for evaluation of foot type, as a clinical decision-making aid for diagnosis of flat and normal arches, based on the Arch Index (AI) and a foot pressure distribution parameter, the Power Ratio (PR). The accuracy of the system was evaluated for 27 subjects aged 24 to 65 years. Foot area measurements (hindfoot, midfoot, and forefoot) were acquired simultaneously from foot pressure intensity images using the portable PedoPowerGraph system, and the images were analyzed in the frequency domain to obtain the PR. With these data, we obtained 100% classification accuracy for normal and flat feet using the linear discriminant analysis method. We observed no misclassification of foot types because foot pressure distribution data were incorporated instead of the arch index (AI) alone. We found that the midfoot pressure distribution ratio and the arch index (AI) correlate well with foot arch type based on visual analysis. Therefore, this paper suggests that the proposed system can accurately and easily determine foot arch type from the arch index (AI) together with the midfoot pressure distribution ratio, rather than from the physical area of contact alone. Such a computational-tool-based system can help clinicians assess foot structure and cross-check a diagnosis of flat foot from the midfoot pressure distribution.

Keywords: arch index, computational tool, static foot pressure intensity image, foot pressure distribution, linear discriminant analysis

Procedia PDF Downloads 492
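For a single feature with equal class priors and a shared variance, linear discriminant analysis reduces to a midpoint-of-means decision rule; a minimal sketch under those simplifying assumptions, with invented arch-index-like values (not the study's two-feature AI/PR data), is:

```python
# Hedged sketch: two-class LDA on one feature. With equal priors and equal
# class variances, the LDA boundary is the midpoint of the class means.
# All data and the feature itself are invented for illustration.
def lda_1d(class_a, class_b):
    """Return a decision function under the equal-prior, shared-variance case."""
    mean_a = sum(class_a) / len(class_a)
    mean_b = sum(class_b) / len(class_b)
    threshold = (mean_a + mean_b) / 2.0   # midpoint rule
    return lambda x: "flat" if x > threshold else "normal"

normal_ai = [0.18, 0.20, 0.22]   # hypothetical arch-index values
flat_ai = [0.30, 0.32, 0.34]
classify = lda_1d(normal_ai, flat_ai)
label = classify(0.33)
```

The study's actual classifier combines AI with the midfoot pressure ratio, i.e. a two-dimensional LDA; the sketch shows only the decision-boundary idea in its simplest form.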
7065 Modified Naive Bayes-Based Prediction Modeling for Crop Yield Prediction

Authors: Kefaya Qaddoum

Abstract:

Most greenhouse growers desire a determined amount of yield in order to accurately meet market requirements. The purpose of this paper is to model a simple but often satisfactory supervised classification method. The original naive Bayes has a serious weakness: it produces redundant predictors. In this paper, a regularization technique was used to obtain a computationally efficient classifier based on naive Bayes. The suggested construction, utilizing an L1 penalty, is capable of clearing out redundant predictors, and a modification of the LARS algorithm is devised to solve this problem, making the method applicable to a wide range of data. In the experimental section, a study was conducted to examine the effect of redundant and irrelevant predictors and to test the method on a WSG dataset of tomato yields, where there are many more predictors than data points and where predicting weekly yield is the goal. Finally, the modified approach is compared with several naive Bayes variants and other classification algorithms (SVM and kNN) and is shown to be fairly good.

Keywords: tomato yield prediction, naive Bayes, redundancy, WSG

Procedia PDF Downloads 225
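The baseline the paper regularizes is plain naive Bayes, which scores each class by summing per-feature log-likelihoods under an independence assumption; a minimal Gaussian sketch, with invented data and equal priors, is:

```python
# Hedged sketch of a plain Gaussian naive Bayes classifier (the unregularized
# baseline); data, features, and classes are invented for illustration.
import math

def gaussian_log_pdf(x, mean, var):
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def fit(rows, labels):
    """Per class, store mean/variance of each feature (equal priors assumed)."""
    model = {}
    for c in set(labels):
        cols = list(zip(*[r for r, y in zip(rows, labels) if y == c]))
        model[c] = []
        for col in cols:
            mean = sum(col) / len(col)
            var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-9
            model[c].append((mean, var))
    return model

def predict(model, row):
    """Pick the class with the largest sum of per-feature log-likelihoods."""
    return max(model, key=lambda c: sum(
        gaussian_log_pdf(x, m, v) for x, (m, v) in zip(row, model[c])))

rows = [[1.0, 5.0], [1.2, 5.2], [3.0, 1.0], [3.2, 1.2]]
labels = ["low", "low", "high", "high"]
model = fit(rows, labels)
pred = predict(model, [3.1, 1.1])
```

The paper's contribution is precisely to prune the per-feature terms in this sum with an L1 penalty, so redundant predictors drop out rather than each contributing a (correlated) vote.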
7064 Earthquake Classification in Molluca Collision Zone Using Conventional Statistical Methods

Authors: H. J. Wattimanela, U. S. Passaribu, A. N. T. Puspito, S. W. Indratno

Abstract:

The Molluca Collision Zone is located at the junction of the Eurasian, Australian, Pacific, and Philippine plates. Between the Sangihe arc, to the west of the collision zone, and the Halmahera arc, to the east, the collision is active and convex toward the Molluca Sea. This research analyzes the behavior of earthquake occurrence in the Molluca Collision Zone: the distribution of earthquakes in each partition region, the type of distribution of earthquake occurrence in each partition region, the mean occurrence of earthquakes in each partition region, and the correlation between the partition regions. We count earthquakes using a partition method and analyze their behavior using conventional statistical methods. The data used are shallow earthquakes with magnitudes ≥ 4 SR for the period 1964-2013 in the Molluca Collision Zone. From the results, we classify the partitioned regions based on correlation into two classes, strong and very strong. This classification can be used in early warning systems for disaster management.

Keywords: molluca collision zone, partition regions, conventional statistical methods, earthquakes, classifications, disaster management

Procedia PDF Downloads 486
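The final classification step above labels inter-partition correlation coefficients as "strong" or "very strong"; a minimal sketch is below, noting that the class boundary (0.8) is an assumed convention from common correlation-strength scales, not a value stated in the abstract:

```python
# Hedged sketch: labelling inter-partition correlation coefficients with the
# two classes used above. The 0.8 boundary is an assumed convention.
def correlation_class(r, boundary=0.8):
    return "very strong" if abs(r) >= boundary else "strong"

# Hypothetical pairwise correlations between partition regions:
labels = [correlation_class(r) for r in (0.75, 0.85, -0.92)]
```

Taking the absolute value makes the rule symmetric in sign, so strongly anti-correlated regions are grouped with strongly correlated ones.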
7063 Low Cost Technique for Measuring Luminance in Biological Systems

Authors: N. Chetty, K. Singh

Abstract:

In this work, the relationship between the melanin content in a tissue and the subsequent absorption of light through that tissue was determined using a digital camera. This technique proved to be simple, cost-effective, efficient, and reliable. Tissue phantom samples were created using milk and soy sauce to simulate the optical properties of melanin in human tissue. Increasing the concentration of soy sauce in the milk corresponded to an increase in an individual's melanin content. Two methods were employed to measure the light transmitted through the sample. The first was direct measurement of the transmitted intensity using a conventional lux meter. The second involved correctly calibrating an ordinary digital camera and using image analysis software to calculate the intensity transmitted through the phantom. The results from these methods were then compared graphically to the theoretical relationship between the intensity of transmitted light and the concentration of absorbers in the sample. Conclusions were then drawn about the effectiveness and efficiency of these low-cost methods.

Keywords: tissue phantoms, scattering coefficient, albedo, low-cost method

Procedia PDF Downloads 264
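The theoretical relationship the measurements are compared against is, in its simplest non-scattering form, the Beer-Lambert law: transmitted intensity decays exponentially with absorber concentration. A minimal sketch, with invented coefficients, is:

```python
# Hedged sketch of the Beer-Lambert relationship between transmitted
# intensity and absorber concentration. Coefficients are invented; real
# tissue phantoms also scatter, which this simple form ignores.
def transmitted_intensity(i0, epsilon, concentration, path_length):
    """I = I0 * 10^(-epsilon * c * l), the Beer-Lambert law."""
    return i0 * 10 ** (-epsilon * concentration * path_length)

# Doubling the absorber concentration squares the transmittance ratio:
i1 = transmitted_intensity(100.0, epsilon=0.5, concentration=1.0, path_length=1.0)
i2 = transmitted_intensity(100.0, epsilon=0.5, concentration=2.0, path_length=1.0)
```

This exponential decay is why the milk-and-soy-sauce phantom, with soy sauce standing in for melanin, should show transmitted intensity dropping sharply as concentration rises.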