Search results for: syntax tree probing
435 A Near-Optimal Domain Independent Approach for Detecting Approximate Duplicates
Authors: Abdelaziz Fellah, Allaoua Maamir
Abstract:
We propose a domain-independent merging-cluster filter approach, complemented with a set of algorithms, for identifying approximate duplicate entities efficiently and accurately within a single data source and across multiple data sources. The near-optimal merging-cluster filter (MCF) approach is based on the well-tuned Monge-Elkan algorithm, extended with an affine variant of the Smith-Waterman similarity measure. We then present constant, variable, and function threshold algorithms that work conceptually in a divide-merge filtering fashion to detect near duplicates as hierarchical clusters along with their corresponding representatives. The algorithms take recursive refinement approaches, in the spirit of filtering, merging, and updating cluster representatives, to detect approximate duplicates at each level of the cluster tree. Experiments show the high effectiveness and accuracy of the MCF approach in detecting approximate duplicates: it outperforms the seminal Monge-Elkan algorithm on several real-world benchmarks and generated datasets.
Keywords: data mining, data cleaning, approximate duplicates, near-duplicates detection, data mining applications and discovery
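As a rough illustration of the Monge-Elkan idea named in the abstract above, the sketch below averages, over the tokens of one string, the best similarity found among the tokens of the other. The inner token measure here is Python's `difflib.SequenceMatcher` ratio, swapped in as a stand-in; it is not the affine Smith-Waterman variant the authors describe. The function names and the 0.85 threshold are assumptions made for illustration only.

```python
from difflib import SequenceMatcher


def token_similarity(a: str, b: str) -> float:
    """Stand-in inner measure (NOT the affine Smith-Waterman variant)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def monge_elkan(left: str, right: str) -> float:
    """Average, over the tokens of `left`, of the best match in `right`."""
    left_tokens = left.split()
    right_tokens = right.split()
    if not left_tokens or not right_tokens:
        return 0.0
    best_matches = [
        max(token_similarity(a, b) for b in right_tokens) for a in left_tokens
    ]
    return sum(best_matches) / len(best_matches)


def is_approximate_duplicate(left: str, right: str, threshold: float = 0.85) -> bool:
    """Constant-threshold test; the 0.85 default is an illustrative assumption."""
    return monge_elkan(left, right) >= threshold
```

Because each token is matched to its best counterpart independently, reordered fields such as "John Smith" and "Smith John" score as identical, which is the property that makes this family of measures useful for record matching.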
Procedia PDF Downloads 385
434 Mineral Status of Feeds and Fodder and Its Subsequent Effect on Plasma of Livestock and Its Products in Red Lateritic Zone of West Bengal, India
Authors: S. K. Pyne, M. Mondal, G. Samanta
Abstract:
A survey was carried out in the red lateritic zone of West Bengal to compare the mineral status in the plasma of livestock grazing over the region. A sufficient number of samples of soil, feeds, fodder and blood were collected from four districts of the red lateritic zone, namely West Midnapore, Birbhum, Bankura and Purulia. The samples were analysed for Calcium (Ca), Phosphorus (P), Copper (Cu), Zinc (Zn), Manganese (Mn) and Iron (Fe). Concentrations of Cu, Mn and Fe in soil were above the minimum critical level, whereas Zn deficiency is widespread in red lateritic soil. Paddy straw in the region is deficient in Ca, P, Zn and Mn. Green fodders are also deficient in P, Cu and Zn. Richness of iron (Fe) in soil, feeds, fodder and tree leaves is characteristic of this region. Phosphorus is deficient in the plasma of all categories of livestock with the exception of bullocks. Cu is deficient in the plasma of calves. Plasma Mn and Fe were higher (p<0.01) in the animals of the red lateritic zone. The study reveals an overall deficiency of phosphorus across categories of livestock and a need for dietary supplementation.
Keywords: mineral, red lateritic zone, grazing livestock, plasma
Procedia PDF Downloads 328
433 Cytochrome B Marker Reveals Three Distinct Genetic Lineages of the Oriental Latrine Fly Chrysomya megacephala (Diptera: Calliphoridae) in Malaysia
Authors: Rajagopal Kavitha, Van Lun Low, Mohd Sofian-Azirun, Chee Dhang Chen, Mohd Yusof Farida Zuraina, Mohd Salleh Ahmad Firdaus, Navaratnam Shanti, Abdul Haiyee Zaibunnisa
Abstract:
This study investigated the hidden genetic lineages in the oriental latrine fly Chrysomya megacephala (Fabricius) across four states (i.e., Johore, Pahang, Perak and Selangor) and a federal territory (i.e., Kuala Lumpur) in Malaysia using the Cytochrome b (Cyt b) genetic marker. The Cyt b phylogenetic tree and haplotype network revealed three distinct genetic lineages of Ch. megacephala. Lineage A, the basal clade, was restricted to flies that originated from Kuala Lumpur and Selangor, while Lineages B and C comprised flies from all studied populations. An overlap of the three genetically divergent groups of Ch. megacephala was observed. However, the flies from both the Kuala Lumpur and Selangor populations consisted of three different lineages, indicating that they are genetically diverse compared to those from Pahang, Perak and Johore.
Keywords: forensic entomology, calliphoridae, mitochondrial DNA, cryptic lineage
Procedia PDF Downloads 511
432 Assessment of the Physical Quality of Eucalyptus Pellita Seedlings
Authors: Sharifah Insyirah, Noraliza A.
Abstract:
Eucalyptus pellita is a popular plantation tree species in many nations and regions because of its fast growth and excellent timber qualities. Moreover, Eucalyptus leaves are a forest harvesting waste with the potential to yield essential oils, and Eucalyptus is one of the plants utilized in the pulp and paper industry. This study investigates the impact of two parameters, fertilizer type and polybag type (black and transparent polybags), on Eucalyptus growth performance in the nursery. The investigation was carried out at the Main Nursery, Forestry Research Institute Malaysia, under the agro-climatic and irrigation conditions of the nursery. Twenty seedlings were prepared for this study, with two treatments: an eco-friendly soil conditioner and NPK (ratio 8:8:8). Survival and height measurements were collected accordingly. Seedlings without any treatment showed better growth than those treated with soil conditioner or NPK. Seedlings in C1 showed consistently the fastest growth compared with T1 (B) and T2 (SC), and the mortality rates were 0%, 15% and 5%, respectively. The results demonstrated that fertilizer and soil conditioner applied at a younger seedling age had less effect on growth performance.
Keywords: eucalyptus pellita, potting media, high quality planting materials, nursery
Procedia PDF Downloads 27
431 Hyperspectral Imagery for Tree Speciation and Carbon Mass Estimates
Authors: Jennifer Buz, Alvin Spivey
Abstract:
The most common greenhouse gas emitted through human activities, carbon dioxide (CO2), is naturally consumed by plants during photosynthesis. This process is actively being monetized by companies wishing to offset their carbon dioxide emissions. For example, companies are now able to purchase protections for vegetated land due to be clear-cut, or to purchase barren land for reforestation. Therefore, by actively preventing the destruction or decay of plant matter, or by introducing more plant matter (reforestation), a company can theoretically offset some of its emissions. One of the biggest issues in the carbon credit market is validating and verifying carbon offsets. There is a need for a system that can accurately and frequently ensure that the areas sold for carbon credits have the vegetation mass (and therefore the carbon offset capability) they claim. Traditional techniques for measuring vegetation mass and determining health are costly and require many person-hours. Orbital Sidekick offers an alternative approach that accurately quantifies carbon mass and assesses vegetation health through satellite hyperspectral imagery, a technique which enables us to remotely identify material composition (including plant species) and condition (e.g., health and growth stage). How much carbon a plant is capable of storing is ultimately tied to many factors, including material density (primarily species-dependent), plant size, and health (trees that are actively decaying are not effectively storing carbon). All of these factors can be observed through satellite hyperspectral imagery. This abstract focuses on speciation. To build a species classification model, we matched pixels in our remote sensing imagery to plants on the ground for which we know the species. To accomplish this, we collaborated with the researchers at the Teakettle Experimental Forest.
Our remote sensing data comes from our airborne “Kato” sensor, which flew over the study area and acquired hyperspectral imagery (400-2500 nm, 472 bands) at ~0.5 m/pixel resolution. Coverage of the entire Teakettle Experimental Forest required capturing dozens of individual hyperspectral images. In order to combine these images into a mosaic, we accounted for potential variations in atmospheric conditions throughout the data collection. To do this, we ran an open source atmospheric correction routine called ISOFIT (Imaging Spectrometer Optimal FITting), which converted all of our remote sensing data from radiance to reflectance. A database of reflectance spectra for each of the tree species within the study area was acquired using the Teakettle stem map and the geo-referenced hyperspectral images. We found that a wide variety of machine learning classifiers were able to identify the species within our images with high (>95%) accuracy. For the most robust quantification of carbon mass and the best assessment of the health of a vegetated area, speciation is critical. Through the use of high resolution hyperspectral data, ground-truth databases, and complex analytical techniques, we are able to determine the species present within a pixel to a high degree of accuracy. These species identifications will feed directly into our carbon mass model.
Keywords: hyperspectral, satellite, carbon, imagery, python, machine learning, speciation
Procedia PDF Downloads 124
430 National Assessment for Schools in Saudi Arabia: Score Reliability and Plausible Values
Authors: Dimiter M. Dimitrov, Abdullah Sadaawi
Abstract:
The National Assessment for Schools (NAFS) in Saudi Arabia consists of standardized tests in Mathematics, Reading, and Science for school grade levels 3, 6, and 9. One main goal is to classify students into four categories of NAFS performance (minimal, basic, proficient, and advanced) by schools and the entire national sample. The NAFS scoring and equating is performed on a bounded scale (D-scale: ranging from 0 to 1) in the framework of the recently developed “D-scoring method of measurement.” The specificity of the NAFS measurement framework and data complexity presented both challenges and opportunities for (a) estimating score reliability for schools, (b) setting cut-scores for the classification of students into categories of performance, and (c) generating plausible values for distributions of student performance on the D-scale. The estimation of score reliability at the school level was performed in the framework of generalizability theory (GT), with students “nested” within schools and test items “nested” within test forms. The GT design was executed via a multilevel modeling syntax code in R. Cut-scores (on the D-scale) for the classification of students into performance categories were derived via a recently developed method of standard setting, referred to as the “Response Vector for Mastery” (RVM) method. For each school, the classification of students into categories of NAFS performance was based on distributions of plausible values for the students’ scores on NAFS tests by grade level (3, 6, and 9) and subject (Mathematics, Reading, and Science). Plausible values (on the D-scale) for each individual student were generated via random selection from a statistical logit-normal distribution with parameters derived from the student’s D-score and its conditional standard error, SE(D).
All procedures related to D-scoring, equating, generating plausible values, and classification of students into performance levels were executed via a computer program in R developed for the purpose of NAFS data analysis.
Keywords: large-scale assessment, reliability, generalizability theory, plausible values
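The plausible-value step described above (random draws from a logit-normal distribution on the 0 to 1 D-scale) might be sketched as follows. The abstract does not say how the distribution parameters are derived from D and SE(D), and the authors worked in R; the sketch below is in Python and assumes, purely for illustration, a delta-method mapping: the location is logit(D) and the scale is SE(D) / (D(1 - D)). The function names are hypothetical.

```python
import math
import random


def logit(p: float) -> float:
    """Map a value in (0, 1) to the real line."""
    return math.log(p / (1.0 - p))


def expit(x: float) -> float:
    """Inverse of logit: map a real number back into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def plausible_values(d_score: float, se_d: float, n_draws: int = 5, seed=None):
    """Draw plausible values on the D-scale from a logit-normal distribution.

    ASSUMPTION: mu and sigma come from a delta-method approximation; the
    source abstract does not state the actual derivation.
    """
    if not 0.0 < d_score < 1.0:
        raise ValueError("d_score must lie strictly between 0 and 1")
    rng = random.Random(seed)
    mu = logit(d_score)
    sigma = se_d / (d_score * (1.0 - d_score))
    # A normal draw on the logit scale, transformed back, stays inside (0, 1).
    return [expit(rng.gauss(mu, sigma)) for _ in range(n_draws)]
```

Sampling on the logit scale and transforming back is what keeps every draw inside the bounded D-scale, which a plain normal distribution around D would not guarantee.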
Procedia PDF Downloads 17
429 Carbon Footprint Assessment Initiative and Trees: Role in Reducing Emissions
Authors: Omar Alelweet
Abstract:
Carbon emissions are quantified in terms of carbon dioxide equivalents, generated through a specific activity or accumulated throughout the life stages of a product or service. Given the growing concern about climate change and the role of carbon dioxide emissions in global warming, this initiative aims to create awareness and understanding of the impact of human activities and identify potential areas for improvement regarding the management of the carbon footprint on campus. Given that trees play a vital role in reducing carbon emissions by absorbing CO₂ during the photosynthesis process, this paper evaluated the contribution of each tree to reducing those emissions. Collecting data over an extended period of time is essential to monitoring carbon dioxide levels. This will help capture changes at different times and identify any patterns or trends in the data. By linking the data to specific activities, events, or environmental factors, it is possible to identify sources of emissions and areas where carbon dioxide levels are rising. Analyzing the collected data can provide valuable insights into ways to reduce emissions and mitigate the impact of climate change.
Keywords: sustainability, green building, environmental impact, CO₂
Procedia PDF Downloads 68
428 Outdoor Thermal Environment Measurement and Simulations in Traditional Settlements in Taiwan
Authors: Tzu-Ping Lin, Shing-Ru Yang
Abstract:
Climate change has a significant impact on the human living environment, and traditional settlements may suffer extreme thermal stress due to their specific building types and living behavior. This study selected Lutaoyang, the largest settlement in the mountainous areas of Tainan County, as the investigation area. Microclimate parameters such as air temperature, relative humidity, wind speed, and mean radiant temperature were measured. The microclimate parameters were also simulated with the ENVI-met model. The results showed that the banyan tree area provides good thermal comfort conditions due to shading. In contrast, the courtyard (traditionally used for drying crops), which is surrounded by low-rise buildings and consists of artificial pavement, contributes to heat stress, especially at noon in summer. In the climate change simulations, the courtyard becomes very hot and is not suitable for residents' activities. These analytical results shed light on sustainability related to the thermal environment in traditional settlements and help develop adaptive measures towards sustainable development under the challenges of climate change.
Keywords: thermal environment, traditional settlement, ENVI-met, Taiwan
Procedia PDF Downloads 477
427 The Genetic Diversity and Conservation Status of Natural Populus Nigra Populations in Turkey
Authors: Asiye Ciftci, Zeki Kaya
Abstract:
Populus nigra is one of the most economically and ecologically important forest trees in Turkey, well known for its rapid growth, good capacity for vegetative propagation and the wide range of uses of its wood. Due to overexploitation, loss of natural distribution area and extensive hybridization and introgression, Populus nigra is one of the most threatened tree species in Turkey and Europe. Using 20 nuclear microsatellite loci, the genetic structure of European black poplar populations along the two largest rivers of Turkey was analyzed. All tested loci were highly polymorphic, displaying 5 to 15 alleles per locus. Observed heterozygosity (overall Ho = 0.79) was higher than expected heterozygosity (overall He = 0.58) in each population. A low level of genetic differentiation among populations (FST = 0.03) and an excess of heterozygotes for each river were found. Human-mediated dispersal, phenotypic selection, a high level of gene flow and extensive circulation of clonal materials may cause these patterns. The genetic data obtained from this study could provide the basis for efficient in situ and ex situ conservation and restoration of the species' natural populations in its natural habitat, as well as for sustainable breeding and poplar plantations in the future.
Keywords: populus, clonal, loci, ex situ
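To make the "excess of heterozygotes" statement above concrete, the sketch below plugs the reported overall values (Ho = 0.79, He = 0.58) into the textbook fixation index F = 1 - Ho/He, where a negative F indicates more heterozygotes than expected. Using this particular index is an assumption made for illustration; the abstract does not state which statistic the authors computed. The helper for expected heterozygosity uses the standard definition He = 1 - Σp², with made-up allele frequencies in the usage line.

```python
def expected_heterozygosity(allele_frequencies):
    """He = 1 - sum(p_i^2) for one locus under random mating."""
    return 1.0 - sum(p * p for p in allele_frequencies)


def fixation_index(h_obs: float, h_exp: float) -> float:
    """F = 1 - Ho/He; negative values indicate heterozygote excess."""
    return 1.0 - h_obs / h_exp


# Overall values reported in the abstract above.
f_overall = fixation_index(0.79, 0.58)
print(f"F = {f_overall:.2f}")

# Hypothetical two-allele locus with equal frequencies (illustration only).
print(expected_heterozygosity([0.5, 0.5]))
```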
Procedia PDF Downloads 291
426 Study for an Optimal Cable Connection within an Inner Grid of an Offshore Wind Farm
Authors: Je-Seok Shin, Wook-Won Kim, Jin-O Kim
Abstract:
An offshore wind farm needs to be designed carefully, considering both economic and reliability aspects. Of the many decision-making problems in designing an entire offshore wind farm, this paper focuses on the inner grid layout, that is, the connections between wind turbines as well as between wind turbines and an offshore substation. The proposed methodology determines the connections and the cable type for each connection section using K-clustering, minimum spanning tree, and cable selection algorithms. A cost evaluation is then performed in terms of investment, power loss, and reliability. Through the cost evaluation, the optimal inner grid layout is determined as the one with the lowest total cost. To demonstrate the validity of the methodology, a case study is conducted on a 240 MW offshore wind farm, and the results show that the methodology helps in designing an offshore wind farm optimally.
Keywords: offshore wind farm, optimal layout, k-clustering algorithm, minimum spanning algorithm, cable type selection, power loss cost, reliability cost
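The minimum spanning tree step mentioned above can be sketched with Prim's algorithm over straight-line distances between turbine positions. This is a minimal illustration only: the coordinates are hypothetical, node 0 is assumed to be the substation, and the clustering, cable selection, and cost evaluation stages of the paper are not modeled.

```python
import heapq
import math


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete graph of Euclidean distances.

    Returns (edges, total_length), where edges are (from_index, to_index).
    """
    if not points:
        return [], 0.0
    visited = {0}
    edges = []
    total = 0.0
    # Candidate cables leaving the starting node (assumed substation).
    heap = [(math.dist(points[0], points[j]), 0, j) for j in range(1, len(points))]
    heapq.heapify(heap)
    while heap and len(visited) < len(points):
        length, i, j = heapq.heappop(heap)
        if j in visited:
            continue  # a shorter connection to j was already chosen
        visited.add(j)
        edges.append((i, j))
        total += length
        for k in range(len(points)):
            if k not in visited:
                heapq.heappush(heap, (math.dist(points[j], points[k]), j, k))
    return edges, total


# Hypothetical layout in kilometres: substation at index 0, three turbines.
layout = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.0, 1.0)]
cables, cable_length = minimum_spanning_tree(layout)
print(cables, cable_length)
```

A spanning tree minimizes total cable length but gives no redundancy, which is presumably why the paper evaluates reliability cost alongside investment and power loss.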
Procedia PDF Downloads 384
425 Comparative Analysis of Predictive Models for Customer Churn Prediction in the Telecommunication Industry
Authors: Deepika Christopher, Garima Anand
Abstract:
To determine the best model for churn prediction in the telecom industry, this paper compares 11 machine learning algorithms, namely Logistic Regression, Support Vector Machine, Random Forest, Decision Tree, XGBoost, LightGBM, CatBoost, AdaBoost, Extra Trees, Deep Neural Network, and Hybrid Model (MLPClassifier). It also aims to pinpoint the top three factors that lead to customer churn and conducts customer segmentation to identify vulnerable groups. According to the data, the Logistic Regression model performs the best, with an F1 score of 0.6215, 81.76% accuracy, 68.95% precision, and 56.57% recall. The top three attributes that cause churn are found to be tenure, Internet Service Fiber optic, and Internet Service DSL, and the three best-performing models in this article are Logistic Regression, Deep Neural Network, and AdaBoost. The K-means algorithm is applied to establish and analyze four different customer clusters. This study has effectively identified customers who are at risk of churn and may be utilized to develop and execute strategies that lower customer attrition.
Keywords: attrition, retention, predictive modeling, customer segmentation, telecommunications
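The metrics reported above for Logistic Regression can be cross-checked against each other, since the F1 score is the harmonic mean of precision and recall. The short sketch below recomputes F1 from the reported precision (68.95%) and recall (56.57%); the function name is illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract above.
print(round(f1_score(0.6895, 0.5657), 4))
```

The recomputed value rounds to the reported F1 of 0.6215, so the three figures are mutually consistent.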
Procedia PDF Downloads 56
424 On Panel Data Analysis of Factors on Economic Advances in Some African Countries
Authors: Ayoola Femi J., Kayode Balogun
Abstract:
In some African countries, an increase in Gross Domestic Product (GDP) has not translated into the real development expected by the common man in his household. For decades, debate over economic growth and development has been a nagging issue. The focus of this study is to analyse the effects of economic determinants on economic advances in some African countries by employing panel data analysis. Yearly data (1990-2013) were obtained from the World Economic Outlook database of the International Monetary Fund (IMF) to probe the effects of these variables on the growth rate in selected African countries, which include: Nigeria, Algeria, Angola, Benin, Botswana, Burundi, Cape Verde, Cameroon, Central African Republic, Chad, Republic of Congo, Côte d'Ivoire, Egypt, Equatorial Guinea, Ethiopia, Gabon, Ghana, Guinea Bissau, Kenya, Lesotho, Madagascar, Mali, Mauritius, Morocco, Mozambique, Niger, Rwanda, Senegal, Seychelles, Sierra Leone, South Africa, Sudan, Swaziland, Tanzania, Togo, Tunisia, and Uganda. The effects of 6 macroeconomic variables on GDP were critically examined. We used the GDP of 37 countries as our dependent variable; the 6 independent variables used in this study are: Total Investment (totinv), Inflation (inf), Population (popl), current account balance (cab), volume of imports of goods and services (vimgs), and volume of exports of goods and services (vexgs). The results of our analysis show that total investment, population and volume of exports of goods and services strongly affect economic growth. We noticed that the population of these selected countries positively affects GDP, while total investment and volume of exports negatively affect GDP. In contrast, the contributions of inflation, current account balance and volume of imports of goods and services to GDP are insignificant. The results of this study would be useful to individual African governments in developing suitable and appropriate economic policies and strategies. It will also help investors to understand the economic nature and viability of Africa as a continent as well as of its individual countries.
Keywords: African countries, economic growth and development, gross domestic products, static panel data models
Procedia PDF Downloads 475
423 Developing Rice Disease Analysis System on Mobile via iOS Operating System
Authors: Rujijan Vichivanives, Kittiya Poonsilp, Canasanan Wanavijit
Abstract:
This research aims to create mobile tools to analyze rice disease quickly and easily. The principles of object-oriented software engineering and the Objective-C language were used as the software development methodology, and the decision tree technique was used as the analysis method. Application users can select the features of a rice disease, or the color that appears on the rice leaves, to obtain recognition analysis results on the iOS mobile screen. After completing the software development, unit testing and integration testing were used to check program validity. In addition, three plant experts and forty farmers assessed the usability and benefit of this system. Overall user satisfaction was found to be at a good level, 57%. The plant experts commented that more disease symptoms should be added to the database for more precise analysis results. For further research, it is suggested that an image processing system be developed as a tool that allows users to search for and analyze rice diseases more conveniently and with greater accuracy.
Keywords: rice disease, data analysis system, mobile application, iOS operating system
Procedia PDF Downloads 286
422 A Comparison between Bèi Passives and Yóu Passives in Mandarin Chinese
Authors: Rui-heng Ray Huang
Abstract:
This study compares the syntax and semantics of two kinds of passives in Mandarin Chinese: bèi passives and yóu passives. To express a Chinese equivalent for ‘The thief was taken away by the police,’ either bèi or yóu can be used, as in Xiǎotōu bèi/yóu jǐngchá dàizǒu le. It is shown in this study that bèi passives and yóu passives differ semantically and syntactically. The semantic observations are based on the theta theory, dealing with thematic roles. On the other hand, the syntactic analysis draws heavily upon the generative grammar, looking into thematic structures. The findings of this study are as follows. First, the core semantics of bèi passives is centered on the Patient NP in the subject position. This Patient NP is essentially an Affectee, undergoing the outcome or consequence brought up by the action represented by the predicate. This may explain why in the sentence Wǒde huà bèi/*yóu tā niǔqū le ‘My words have been twisted by him/her,’ only bèi is allowed. This is because the subject NP wǒde huà ‘my words’ suffers a negative consequence. Yóu passives, in contrast, place the semantic focus on the post-yóu NP, which is not an Affectee though. Instead, it plays a role which has to take certain responsibility without being affected in a way like an Affectee. For example, in the sentence Zhèbù diànyǐng yóu/*bèi tā dānrèn dǎoyǎn ‘This film is directed by him/her,’ only the use of yóu is possible because the post-yóu NP tā ‘s/he’ refers to someone in charge, who is not an Affectee, nor is the sentence-initial NP zhèbù diànyǐng ‘this film’. When it comes to the second finding, the syntactic structures of bèi passives and yóu passives differ in that the former involve a two-place predicate while the latter a three-place predicate. 
The passive morpheme bèi in a case like Xiǎotōu bèi jǐngchá dàizǒu le ‘The thief was taken away by the police’ has been argued by some Chinese syntacticians to be a two-place predicate which selects an Experiencer subject and an Event complement. Under this analysis, the initial NP xiǎotōu ‘the thief’ in the above example is a base-generated subject. This study, however, proposes that yóu passives fall into a three-place unergative structure. In the sentence Xiǎotōu yóu jǐngchá dàizǒu le ‘The thief was taken away by the police,’ the initial NP xiǎotōu ‘the thief’ is a topic which serves as a Patient taken by the verb dàizǒu ‘take away.’ The subject of the sentence is assumed to be an Agent, which is in a null form and may find its reference from the discourse or world knowledge. Regarding the post-yóu NP jǐngchá ‘the police,’ its status is dual. On the one hand, it is a Patient introduced by the light verb yóu; on the other, it is an Agent assigned by the verb dàizǒu ‘take away.’ It is concluded that the findings in this study contribute to better understanding of what makes the distinction between the two kinds of Chinese passives.
Keywords: affectee, passive, patient, unergative
Procedia PDF Downloads 273
421 Using Data-Driven Model on Online Customer Journey
Authors: Ing-Jen Hung, Tzu-Chien Wang
Abstract:
Nowadays, customers can easily interact with firms through miscellaneous online ads on different channels. In other words, customers now have innumerable options and limitless time to accomplish their commercial activities with firms, individualizing their own online customer journey. This convenience emphasizes the importance of allocating online advertisements across different channels. A profound understanding of customer behavior can therefore yield considerable benefit by optimizing fund allocation across diverse ad channels. To achieve this objective, many firms use numerical methodologies to create data-driven advertisement policies. In our research, we aim to exploit online customer click data to discover the correlations between channels and their sequential relations. We use LSTM to deal with the sequential property of our data and compare its accuracy with other non-sequential methods, such as the CART decision tree and logistic regression. In addition, we classify our customers into several groups by their behavioral characteristics to perceive the differences between groups as customer portraits. As a result, we discover a distinct customer journey under each customer portrait. Our article provides some insights into marketing research and can help firms formulate online advertising criteria.
Keywords: LSTM, customer journey, marketing, channel ads
Procedia PDF Downloads 120
420 A Consideration of Dialectal and Stylistic Shifts in Literary Translation
Authors: Pushpinder Syal
Abstract:
Literary writing carries the stamp of the current language of its time. In translating such texts, it becomes a challenge to capture such reflections which may be evident at several levels: the level of dialectal use of language by characters in stories, the alterations in syntax as tools of writers’ individual stylistic choices, the insertion of quasi-proverbial and gnomic utterances, and even the level of the pragmatics of narrative discourse. Discourse strategies may differ between earlier and later texts, reflecting changing relationships between narrators and readers in changed cultural and social contexts. This paper is a consideration of these features by an approach that combines historicity with a description, contextualizing language change within a discourse framework. The process of translating a collection of writings of Punjabi literature spanning 100 years was undertaken for this study and it was observed that the factor of the historicity of language was seen to play a role. While intended for contemporary readers, the translation of literature over the span of a century poses the dual challenge of needing to possess both accessibility and immediacy as well as adherence to the 'old world' styles of communicating and narrating. The linguistic changes may be observed in a more obvious sense in the difference of diction and word formation – with evidence of more hybridized and borrowed forms in modern and contemporary writings, as compared to the older writings. The latter not only contain vestiges of proverbs and folk sayings, but are also closer to oral speech styles. These will be presented and analysed in the form of chronological listing and by these means, the social process of translation from orality to written text can be seen as traceable in the above-mentioned works. 
More subtle and underlying shifts can be seen through the analysis of speech acts and implicatures in the same literature, in which the social relationships underlying language use are evident as discourse systems of belief and understanding. They present distinct shifts in worldview as seen at different points in time. However, some continuities of language and style are also clearly visible, and these aid the translator in putting together a set of thematic links which identify the literature of a region and community, and constitute essential outcomes in the effort to preserve its distinctive nature.
Keywords: cultural change, dialect, historicity, stylistic variation
Procedia PDF Downloads 129
419 De Novo Assembly and Characterization of the Transcriptome during Seed Development, and Generation of Genic-SSR Markers in Pomegranate (Punica granatum L.)
Authors: Ozhan Simsek, Dicle Donmez, Burhanettin Imrak, Ahsen Isik Ozguven, Yildiz Aka Kacar
Abstract:
Pomegranate (Punica granatum L.) is known to be one of the oldest edible fruit tree species, with a wide global geographical distribution. Fruits from two defined varieties (Hicaznar and 33N26) were taken at intervals after pollination and fertilization, at different sizes. Seed samples were used for transcriptome sequencing. Primary sequencing was performed on the Illumina Hi-Seq™ 2000. The raw reads were first subjected to quality control (QC), then filtered into clean reads and aligned to the reference sequences. De novo analysis was performed to detect genes expressed in the seeds of the pomegranate varieties. We performed downstream analysis to determine differentially expressed genes. We generated about 27.09 Gb of bases in total after Illumina Hi-Seq sequencing. All samples were assembled together, yielding 59,264 Unigenes; the total length, average length, N50, and GC content of the Unigenes are 84,547,276 bp, 1,426 bp, 2,137 bp, and 46.20%, respectively. Unigenes were annotated with 7 functional databases; in total, 42,681 (NR: 72.02%), 39,660 (NT: 66.92%), 30,790 (Swissprot: 51.95%), 20,212 (COG: 34.11%), 27,689 (KEGG: 46.72%), 12,328 (GO: 20.80%), and 33,833 (Interpro: 57.09%) Unigenes were annotated. With the functional annotation results, we detected 42,376 CDS, and 4,999 SSRs distributed on 16,143 Unigenes.
Keywords: next generation sequencing, SSR, RNA-Seq, Illumina
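The N50 statistic reported above (2,137 bp) summarizes assembly contiguity: it is the length such that sequences at least that long account for half of the total assembled bases. A minimal sketch of the usual calculation follows; the function name and the toy lengths in the usage line are illustrative and are not data from the study.

```python
def n50(lengths):
    """Length L such that sequences of length >= L cover at least half the total."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50 requires at least one sequence length")


# Toy example (not study data): total is 20, and 6 + 5 = 11 reaches half.
print(n50([2, 3, 4, 5, 6]))
```

An N50 above the mean length, as reported here (2,137 bp against an average of 1,426 bp), is typical, because a minority of long sequences carries a large share of the bases.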
Procedia PDF Downloads 239
418 Enhanced Extra Trees Classifier for Epileptic Seizure Prediction
Authors: Maurice Ntahobari, Levin Kuhlmann, Mario Boley, Zhinoos Razavi Hesabi
Abstract:
For machine learning based epileptic seizure prediction, it is important for the model to be implemented in small implantable or wearable devices that can be used to monitor epilepsy patients; however, current state-of-the-art methods are complex and computationally intensive. We use Shapley Additive Explanation (SHAP) to find relevant intracranial electroencephalogram (iEEG) features and improve the computational efficiency of a state-of-the-art seizure prediction method based on the extra trees classifier while maintaining prediction performance. Results for a small contest dataset and a much larger dataset with continuous recordings of up to 3 years per patient from 15 patients yield better than chance prediction performance (p < 0.004). Moreover, while the performance of the SHAP-based model is comparable to that of the benchmark, the overall training and prediction time of the model has been reduced by a factor of 1.83. It can also be noted that the feature called zero crossing value is the best EEG feature for seizure prediction. These results suggest state-of-the-art seizure prediction performance can be achieved using efficient methods based on optimal feature selection.
Keywords: machine learning, seizure prediction, extra tree classifier, SHAP, epilepsy
Procedia PDF Downloads 111
417 Glucose Monitoring System Using Machine Learning Algorithms
Authors: Sangeeta Palekar, Neeraj Rangwani, Akash Poddar, Jayu Kalambe
Abstract:
Biomedical analysis is an indispensable procedure for identifying health-related diseases like diabetes. Monitoring the glucose level in the body regularly helps identify hyperglycemia and hypoglycemia, which can cause severe medical problems like nerve damage or kidney disease. This paper presents a method for predicting the glucose concentration in blood samples using image processing and machine learning algorithms. The glucose solution is prepared by the glucose oxidase (GOD) and peroxidase (POD) method. An experimental database is generated based on the colorimetric technique. The image of the glucose solution is captured by a Raspberry Pi camera and analyzed using image processing by extracting the RGB, HSV, and LUX color space values. Regression algorithms such as multiple linear regression, decision tree, random forest, and XGBoost were used to predict the unknown glucose concentration. The multiple linear regression algorithm predicts the results with 97% accuracy. The image processing and machine learning-based approach reduces the hardware complexity of existing platforms.
Keywords: artificial intelligence glucose detection, glucose oxidase, peroxidase, image processing, machine learning
Procedia PDF Downloads 201
416 An Ecological Reading of Indian Regional Literature: A Comparative Ecocritical Analysis of Punjabi Poet Shiv Kumar Batalvi and Surjit Patar's Poetry
Authors: Zameerpal Kaur
Abstract:
Ecocriticism came into existence in the 1990s. It explores the relationship of literature with the natural world and examines the role that natural surroundings and the environment play in the minds of creative writers during their imagination and creative process. The present study is a comparative ecocritical analysis of selected poetry by Shiv Kumar Batalvi and Surjit Patar within the theoretical framework of ecocriticism, in order to shed light on the poets' vigilant views about the relationship between human life and nature. Shiv Kumar Batalvi is a renowned modern Punjabi poet. He is essentially a poet of nature and love. His opinions about nature support his position as a major representative of recent environmental issues and ecocritical concerns in Punjabi literature. One of the most outstanding modern Punjabi poets, he is endowed with a most artistic temperament, and nature always has a dominating presence in his poetry. He seems to consciously portray scenes of natural surroundings in his poetry; in fact, the titles of his poems in themselves signify his love for nature. Surjit Patar, an eminent modern Punjabi poet, presents a different picture of nature in his poems; he also writes poems about contemporary problems. Surjit Patar's radical quarrel with the modern cultural context makes him reject all absolutes and finalities in the form of transcendental reason and religion, history and evolution, and he writes freely about the deterioration of nature at the hands of a selfish materialistic society. He is a modern poet who weaves natural imagery into the syntax of his poems. Patar's work reflects a universal voice that is imbued with nuanced humanism and a sense of modernity that seems neither dated nor trapped in regional boundaries. Through his poetry he has given a voice to the fragile, disrupting borders, disturbing the status quo.
Throughout the comparative study of the selected works, an attempt has been made to analyse the poetic works of these poets from an ecocritical perspective, focussing especially on aspects of ecocriticism such as ecocentric ethics, ecoaesthetics, and anthropomorphism.
Keywords: anthropocentrism, degradation, environment and literature, nature
Procedia PDF Downloads 467
415 Students’ Speech Anxiety in Blended Learning
Authors: Mary Jane B. Suarez
Abstract:
Public speaking anxiety (PSA), also known as speech anxiety, is persistent in traditional communication classes, especially for students who learn English as a second language. Speech anxiety intensifies when communication skills assessments take their toll in an online or remote mode of learning adopted due to the perils of the COVID-19 virus. Both teachers and students have experienced great uncertainty about how to teach and learn speaking skills effectively amidst the pandemic. Communication skills assessments like public speaking, oral presentations, and student reporting have taken on new meaning through Google Meet, Zoom, and other online platforms. Though such technologies have paved the way for more creative ways for students to acquire and develop communication skills, the effectiveness of such assessment tools stands in question. This mixed-method study aimed (1) to determine the factors that affected the public speaking skills of students in a communication class; (2) to probe the gaps in assessing the speaking skills of students attending online classes vis-à-vis the implementation of remote and blended modalities of learning; and (3) to recommend ways to address the public speaking anxiety of students performing a speaking task online and to bridge the assessment gaps, based on the outcome of the study, in order to achieve a smooth segue from online to on-ground instruction and move towards a much better post-pandemic academic milieu. Using a convergent parallel design, both quantitative and qualitative data were reconciled by probing the public speaking anxiety of students and the potential assessment gaps encountered in an online English communication class under remote and blended learning. There were four phases in applying the convergent parallel design.
The first phase was data collection, where both quantitative and qualitative data were collected using document reviews and focus group discussions. The second phase was data analysis, where quantitative data were treated statistically, particularly with frequency, percentage, and mean, using Microsoft Excel and IBM Statistical Package for Social Sciences (SPSS) version 19, and qualitative data were examined using thematic analysis. The third phase was the merging of the data analysis results to compare the desired learning competencies with the actual learning competencies of students. Finally, the fourth phase was the interpretation of the merged data, which led to the finding that a significantly high percentage of students experienced public speaking anxiety whenever they delivered speaking tasks online. Assessment gaps were also identified by comparing the desired learning competencies of the formative and alternative assessments implemented with the actual speaking performances of students, which showed that the public speaking anxiety of students was not properly identified and processed.
Keywords: blended learning, communication skills assessment, public speaking anxiety, speech anxiety
Procedia PDF Downloads 102
414 Constant Factor Approximation Algorithm for p-Median Network Design Problem with Multiple Cable Types
Authors: Chaghoub Soraya, Zhang Xiaoyan
Abstract:
This research presents the first constant approximation algorithm for the p-median network design problem with multiple cable types. The problem was previously addressed with a single cable type, for which a bifactor approximation algorithm exists. To the best of our knowledge, the algorithm proposed in this paper is the first constant approximation algorithm for p-median network design with multiple cable types. The addressed problem combines two well-studied problems: the p-median problem and the network design problem. The introduced algorithm is a constant-factor random sampling approximation algorithm built on random sampling techniques from the literature. It is based on a redistribution lemma from the literature and uses a Steiner tree problem as a subproblem. The algorithm is simple and relies on the notions of random sampling and probability. The proposed approach gives an approximate solution with a single constant ratio without violating any of the constraints, in contrast to the one proposed in the literature. This paper provides a (21 + 2)-approximation algorithm for the p-median network design problem with multiple cable types using random sampling techniques.
Keywords: approximation algorithms, buy-at-bulk, combinatorial optimization, network design, p-median
Procedia PDF Downloads 201
413 First Attempts Using High-Throughput Sequencing in Senecio from the Andes
Authors: L. Salomon, P. Sklenar
Abstract:
The Andes hold the highest plant species diversity in the world. How this occurred is one of the most intriguing questions in studies addressing the origin and patterning of plant diversity worldwide. Recently, the explosive adaptive radiations found in high Andean groups have been pointed to as triggers of this spectacular diversity. The Andes are the most species-rich area for Senecio, the largest genus of the Asteraceae family. There, the genus presents an incredible diversity of species, striking variation in growth form, and a large niche span. Although some studies have tried to disentangle the evolutionary history of some Andean species of Senecio, they obtained partially resolved and poorly supported phylogenies, as expected for recently radiated groups. High-throughput sequencing (HTS) approaches have proved to be a powerful tool for answering phylogenetic questions in groups whose evolutionary histories are recent and for which traditional techniques like Sanger sequencing are not informative enough. Although these tools have been used to understand the evolution of an increasing number of Andean groups, they have not yet been applied to Senecio. This project aims to contribute to a better knowledge of the mechanisms shaping the hyperdiversity of Senecio in the Andean region, using HTS and focusing on the recently recircumscribed Senecio ser. Culcitium (Asteraceae). The first aim is to reconstruct a highly resolved and supported phylogeny, and the second is to assess the role of allopatric differentiation, hybridization, and genome duplication in the diversification of the group. Using the Hyb-Seq approach, which combines target enrichment using Asteraceae COS loci baits with genome skimming, more than 100 new accessions were generated. The HybPhyloMaker and HybPiper pipelines were used for the phylogenetic analyses, and another pipeline in development (Paralogue Wizard) was used to deal with paralogues. RAxML was used to generate gene trees and Astral for species tree reconstruction.
Phyparts was used as a first step to explore gene tree discordance along the clades. Fully resolved, moderately supported trees were obtained, showing Senecio ser. Culcitium to be monophyletic. Within the group, some species formed well-supported clades with morphologically related species, while some species would not have exclusive ancestry, in concordance with previous studies using amplified fragment length polymorphism (AFLP) that showed geographical differentiation. Discordance between gene trees was detected. Paralogues were detected for many loci, indicating possible genome duplications; ploidy level estimation using flow cytometry will be carried out over the coming months in order to identify the role of this process in the diversification of the group. Likewise, the TreeSetViz package for Mesquite, the hierarchical likelihood ratio congruence test using Concaterpillar, and the Procrustean Approach to Cophylogeny (PACo) will be used to evaluate the congruence among different inheritance patterns. In order to evaluate the influence of hybridization and Incomplete Lineage Sorting (ILS) in each resultant clade of the phylogeny, Joly et al.'s 2009 method in a coalescent scenario and Patterson’s D-statistic will be applied. Although the main sources of discordance between gene trees have not yet been explored in detail, the data show that, at least to some degree, processes such as genome duplication, hybridization, and/or ILS could be involved in the evolution of the group.
Keywords: adaptive radiations, Andes, genome duplication, hybridization, Senecio
Procedia PDF Downloads 137
412 Helping the Development of Public Policies with Knowledge of Criminal Data
Authors: Diego De Castro Rodrigues, Marcelo B. Nery, Sergio Adorno
Abstract:
The project aims to develop a framework for social data analysis, particularly by mobilizing criminal records and applying descriptive computational techniques, such as associative algorithms and the extraction of decision tree rules, among others. The methods and instruments discussed in this work will enable the discovery of patterns, providing a guided means to identify similarities between recurring situations in the social sphere using descriptive techniques and data visualization. The study area has been defined as the city of São Paulo, with the structuring of social data as the central idea and a particular focus on the quality of the information. Given this, a set of tools will be validated, including the use of a database and tools for visualizing the results. Among the main deliverables related to products and the development of articles are the discoveries made during the research phase. The effectiveness and utility of the results will depend on studies involving real data, validated both by domain experts and by identifying and comparing the patterns found in this study with other phenomena described in the literature. The intention is to contribute to evidence-based understanding and decision-making in the social field.
Keywords: social data analysis, criminal records, computational techniques, data mining, big data
Procedia PDF Downloads 84
411 System Survivability in Networks
Authors: Asma Ben Yaghlane, Mohamed Naceur Azaiez
Abstract:
We consider the problem of attacks on networks. We define the concept of system survivability in networks in the presence of intelligent threats. Our setting of the problem assumes a flow to be sent from one source node to a destination node. The attacker attempts to disable the network by preventing the flow from reaching its destination, while the defender attempts to identify the best path-set to use to maximize the chance that the flow arrives at the destination node. Our concept is shown to be different from the classical concept of network reliability. We distinguish two types of network survivability, related to the defender and to the attacker of the network, respectively. We prove that the defender-based network survivability plays the role of a lower bound on network reliability, while the attacker-based network survivability plays the role of an upper bound. We also prove that the two concepts almost never agree with each other or coincide with network reliability. Moreover, we use the shortest-path problem to determine the defender-based network survivability and the min-cut problem to determine the attacker-based network survivability. We extend the problem to a variety of models, including the minimum-spanning-tree problem and the multiple-source/multiple-destination network problems.
Keywords: defense/attack strategies, information, networks, reliability, survivability
Procedia PDF Downloads 391
410 Recursion, Merge and Event Sequence: A Bio-Mathematical Perspective
Authors: Noury Bakrim
Abstract:
Formalization is indeed foundational to Mathematical Linguistics, as demonstrated by the pioneering works. While dialoguing with this frame, we nonetheless propose, in our approach to language as a real object, a mathematical linguistics/biosemiotics defined as a dialectical synthesis between induction and computational deduction. Therefore, relying on the parametric interaction of cycles, rules, and features giving way to a sub-hypothetic biological point of view, we first hypothesize a factorial equation as an explanatory principle within Category Mathematics of the Ergobrain: our computation proposal of Universal Grammar rules per cycle or a scalar determination (multiplying right/left columns of the determinant matrix and right/left columns of the logarithmic matrix) of the transformable matrix for rule addition/deletion and cycles within representational mapping/cycle heredity based on the factorial example, being the logarithmic exponent or power of rule deletion/addition. It enables us to propose an extension of minimalist merge/label notions to a Language Merge (as a computing principle) within cycle recursion relying on combinatorial mapping of rules hierarchies on external Entax of the Event Sequence.
Therefore, to define combinatorial maps as language merge of features and combinatorial hierarchical restrictions (governing, commanding, and other rules), we secondly hypothesize from our results feature/hierarchy exponentiation on graph representation deriving from Gromov's Symbolic Dynamics, where combinatorial vertices from Fe are set to combinatorial vertices of Hie and edges from Fe to Hie, such that for every combinatorial group there are restriction maps representing different derivational levels that are subgraphs: the intersection on I defines pullbacks and deletion rules (under restriction maps), then under disjunction edges H, such that for the combinatorial map P belonging to Hie exponentiation by intersection there are pullbacks and projections that are equal to restriction maps RM₁ and RM₂. The model will draw on experimental biomathematics as well as structural frames, with a focus on Amazigh and English (cases from phonology/micro-semantics, syntax) and the shift from structure to event (especially the Amazigh formant principle resolving its morphological heterogeneity).
Keywords: rule/cycle addition/deletion, bio-mathematical methodology, general merge calculation, feature exponentiation, combinatorial maps, event sequence
Procedia PDF Downloads 125
409 A Study of the Challenges in Adoption of Renewable Energy in Nigeria
Authors: Farouq Sule Garo, Yahaya Yusuf
Abstract:
The purpose of this study is to investigate why there is a general lack of successful adoption of sustainable energy in Nigeria. This is particularly important given the current global campaign for net-zero emissions. The 26th United Nations Conference of the Parties (COP26), held in 2021, was hosted by the UK in Glasgow, where, amongst other things, countries including Nigeria agreed to a zero emissions pact. There is, therefore, an obligation on the part of Nigeria to transition from a fossil fuel-based economy to a sustainable net-zero emissions economy. The adoption of renewable energy is fundamental to achieving this ambitious target if the decarbonisation of economic activities is to become a reality. Nigeria has abundant sources of renewable energy, and yet uptake has been poor, and where attempts have been made to develop and harness renewable energy resources, there has been limited success. It is not entirely clear why this is the case. When analysts allude to corruption as the reason for the failure to adopt renewable energy or implement projects successfully, it is arguable that corruption alone cannot explain the situation. Therefore, there is a need for a thorough investigation into the underlying issues surrounding the poor uptake of renewable energy in Nigeria. This pilot study, drawing upon stakeholder theory, adopts a multi-stakeholder perspective to investigate the influence and impact of economic, political, technological, and social factors on the adoption of renewable energy in Nigeria. The research will also investigate how these factors shape (or fail to shape) strategies for achieving successful adoption of renewable energy in the country. A qualitative research methodology has been adopted, given that the research requires in-depth studies in specific settings rather than a general population survey. There will be a number of interviews, and each interview will allow thorough probing of sources.
These are in addition to the six interviews that have already been conducted, which focused primarily on the economic dimensions of the challenges in the adoption of renewable energy. The six participants in these initial interviews were all connected to the Katsina Wind Farm Project, which was conceived and built with a view to diversifying Nigeria's energy mix and capitalising on the vast wind energy resources in the northern region. The findings from the six interviews provide insights into how economic factors impact the wind farm project. Some key drivers have been identified, including strong governmental support and the recognition of the need for energy diversification. These drivers have played crucial roles in initiating and advancing the Katsina Wind Farm Project. In addition, the initial analysis has highlighted various challenges encountered during the project's implementation, including financial, regulatory, and environmental aspects. These challenges provide valuable lessons that can inform strategies to mitigate risks and improve future wind energy projects.
Keywords: challenges in adoption of renewable energy, economic factors, net-zero emission, political factors
Procedia PDF Downloads 38
408 A Linguistic Analysis of the Inconsistencies in the Meaning of Some -er Suffix Morphemes
Authors: Amina Abubakar
Abstract:
English, like any other language, is rich in arbitrary, conventional symbols, which give rise to many inconsistencies in spelling, phonology, syntax, and morphology. The research examines the irregularities prevalent in the structure and meaning of some -er lexical items in English and their implications for vocabulary acquisition. It centers its investigation on the derivational suffix -er, which changes the grammatical category of a word. The English language poses many challenges to second language learners because of its irregularities, exceptions, and rules. One of the meanings of the -er derivational suffix is someone who does something. This rule often confuses learners when they meet exceptions in normal discourse. The need to investigate instances of such inconsistencies in the formation of -er words and the meanings given to such words by students motivated this study. For this purpose, some senior secondary two (SS2) students in six randomly selected schools in the metropolis were given a large number of alphabetically selected words ending in the -er suffix. The researcher opted for a test technique, which required the students to provide the meanings of the selected -er words. The test was scored on a scale of 1-0, where a correct formation of an -er word and its meaning scored one, while a wrong formation and meaning scored zero. The numbers of wrong and correct formations of -er words and meanings were calculated as percentages. The results show that a large number of students made wrong generalizations about the meanings of the selected -er ending words. This shows how enormous the inconsistencies in the English language are and how they affect the learning of English. Findings from the study revealed that although students had mastered the basic morphological rules, errors were generally committed on vocabulary items that are not frequently in use.
The study arrives at this conclusion from a survey of the students' textbook and their spoken activities. Therefore, the researcher recommends an effective reappraisal of language teaching through implementation of the designed curriculum to reflect modern strategies of language teaching; identification of the exceptions and their incorporation in rigorous communicative activities, language course books, and tutorials; and training and retraining of teachers in strategies that conform to the new pedagogy.
Keywords: ESL (English as a second language), derivational morpheme, inflectional morpheme, suffixes
Procedia PDF Downloads 376
407 Feature Extraction and Impact Analysis for Solid Mechanics Using Supervised Finite Element Analysis
Authors: Edward Schwalb, Matthias Dehmer, Michael Schlenkrich, Farzaneh Taslimi, Ketron Mitchell-Wynne, Horen Kuecuekyan
Abstract:
We present a generalized feature extraction approach for supporting Machine Learning (ML) algorithms which perform tasks similar to Finite-Element Analysis (FEA). We report results for estimating the Head Injury Categorization (HIC) of vehicle engine compartments across various impact scenarios. Our experiments demonstrate that models learned using features derived with a simple discretization approach provide a reasonable approximation of a full simulation. We observe that Decision Trees could be as effective as Neural Networks for the HIC task. The simplicity and performance of the learned Decision Trees could offer a trade-off: an increase in speed and a cost improvement of multiple orders of magnitude over full simulation, in exchange for a reasonable approximation. When used as a complement to full simulation, the approach enables rapid approximate feedback to engineering teams before submission for full analysis. The approach produces mesh-independent features and is further agnostic of the assembly structure.
Keywords: mechanical design validation, FEA, supervised decision tree, convolutional neural network
Procedia PDF Downloads 139
406 Meta-Learning for Hierarchical Classification and Applications in Bioinformatics
Authors: Fabio Fabris, Alex A. Freitas
Abstract:
Hierarchical classification is a special type of classification task where the class labels are organised into a hierarchy, with more generic class labels being ancestors of more specific ones. Meta-learning for classification-algorithm recommendation consists of recommending to the user a classification algorithm, from a pool of candidate algorithms, for a dataset, based on the past performance of the candidate algorithms on other datasets. Meta-learning is normally used in conventional, non-hierarchical classification. By contrast, this paper proposes a meta-learning approach for the more challenging task of hierarchical classification and evaluates it on a large number of bioinformatics datasets. Hierarchical classification is especially relevant for bioinformatics problems, as protein and gene functions tend to be organised into a hierarchy of class labels. This work proposes a meta-learning approach for recommending the best hierarchical classification algorithm for a hierarchical classification dataset. This work’s contributions are: 1) proposing an algorithm for splitting hierarchical datasets into new datasets to increase the number of meta-instances, 2) proposing meta-features for hierarchical classification, and 3) interpreting decision-tree meta-models for hierarchical classification algorithm recommendation.
Keywords: algorithm recommendation, meta-learning, bioinformatics, hierarchical classification
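To make the notion of a label hierarchy concrete, the sketch below expands a specific class label into the chain of more generic labels above it. It is an illustration of the definition only and not the paper's algorithm: it assumes a tree in which each label has at most one parent (real bioinformatics hierarchies may allow several), and the function name and toy labels are hypothetical.

```python
def with_ancestors(label, parent_of):
    """Return the label followed by its ancestors, most specific first.

    parent_of maps each label to its single parent; root labels are
    absent from the mapping. A tree-shaped hierarchy is assumed.
    """
    path = [label]
    while path[-1] in parent_of:
        path.append(parent_of[path[-1]])
    return path


# Hypothetical toy hierarchy, not taken from the paper's datasets.
parent_of = {"kinase": "enzyme", "enzyme": "protein"}
print(with_ancestors("kinase", parent_of))
```

Under this view, predicting a specific label implies predicting every label on its path to the root, which is one reason hierarchical classification is treated as a distinct and harder task than flat classification.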
Procedia PDF Downloads 311