Search results for: lexical similarity
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 894

654 Low Cost Webcam Camera and GNSS Integration for Updating Home Data Using AI Principles

Authors: Mohkammad Nur Cahyadi, Hepi Hapsari Handayani, Agus Budi Raharjo, Ronny Mardianto, Daud Wahyu Imani, Arizal Bawazir, Luki Adi Triawan

Abstract:

PDAM (the local water company) determines customer charges by considering the customer's building or house. Charge determination significantly affects PDAM income and customer costs because PDAM applies a subsidy policy for customers classified as small households. Periodic updates are needed so that pricing is in line with the target. A thorough customer survey in Surabaya is needed to update customer building data. However, surveys so far have been carried out by deploying officers to survey each PDAM customer one by one. Surveys with this method require a lot of effort and cost. For this reason, this research offers a technology called mobile mapping, a mapping method that is more efficient in terms of time and cost. The tool is also quite simple to use: the device is installed on a car so that it can record the surrounding buildings while the car is moving. Mobile mapping technology generally uses lidar sensors equipped with GNSS, but this technology is expensive. To overcome this problem, this research develops low-cost mobile mapping technology using a webcam camera sensor added to the GNSS and IMU sensors. The camera used has specifications of 3MP with a resolution of 720 and a diagonal field of view of 78°. The principle of this invention is to integrate four camera sensors, a GNSS webcam, and GPS to acquire photo data, which is equipped with location data (latitude, longitude) and IMU data (roll, pitch, yaw). This device is also equipped with a tripod and a vacuum cleaner to attach to the car's roof so it doesn't fall off while running. The output data from this technology will be analyzed with artificial intelligence to reduce similar data (Cosine Similarity) and then classify building types. Data reduction is used to eliminate similar data and keep the image that displays the complete house so that it can be processed for later classification of buildings.
The AI method used is transfer learning by utilizing a trained model named VGG-16. From the analysis of similarity data, it was found that the data reduction reached 50%. Then georeferencing is done using the Google Maps API to get address information according to the coordinates in the data. After that, geographic join is done to link survey data with customer data already owned by PDAM Surya Sembada Surabaya.
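
The data-reduction step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the feature vectors stand in for embeddings that would come from a pretrained network such as VGG-16, and both the 0.9 threshold and the rule of comparing each frame with the last kept frame are assumptions, since the abstract does not state them.

```python
import numpy as np

def reduce_similar(features: np.ndarray, threshold: float = 0.9) -> list[int]:
    """Return indices of frames to keep, skipping near-duplicates.

    `features` holds one embedding per row (hypothetical stand-ins for
    VGG-16 features). A frame is dropped when its cosine similarity to
    the most recently kept frame is at or above `threshold` (assumed).
    """
    if len(features) == 0:
        return []
    # Normalise rows so that a dot product equals cosine similarity.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)
    kept = [0]
    for i in range(1, len(unit)):
        if float(unit[i] @ unit[kept[-1]]) < threshold:
            kept.append(i)
    return kept

# Toy embeddings: frames 0/1 and frames 2/3 are near-duplicates.
demo = np.array([[1.0, 0.0], [0.99, 0.05], [0.0, 1.0], [0.02, 1.0]])
kept_frames = reduce_similar(demo, threshold=0.9)
```

On this toy input, half of the frames are discarded; the real reduction rate depends on the imagery and the chosen threshold.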

Keywords: mobile mapping, GNSS, IMU, similarity, classification

Procedia PDF Downloads 77
653 CT Medical Images Denoising Based on New Wavelet Thresholding Compared with Curvelet and Contourlet

Authors: Amir Moslemi, Amir movafeghi, Shahab Moradi

Abstract:

Noise is one of the most important challenges in medical imaging. Image denoising refers to the improvement of a digital medical image that has been corrupted by Additive White Gaussian Noise (AWGN). A digital medical image or video can be affected by different types of noise, namely impulse noise, Poisson noise and AWGN. Computed tomography (CT) images suffer from low quality due to noise. The quality of CT images depends directly on the absorbed dose to patients (ADP): an increase in absorbed radiation, and consequently in ADP, enhances CT image quality. Noise reduction techniques that enhance image quality without exposing patients to excess radiation are therefore one of the challenging problems in CT image processing. In this work, noise reduction in CT images was performed using two directional two-dimensional (2D) transformations, Curvelet and Contourlet, and the discrete wavelet transform (DWT) thresholding methods BayesShrink and AdaptShrink, which were compared with each other. We also propose a new threshold in the wavelet domain that aims not only at noise reduction but also at edge retention; consequently, the proposed method retains the modified coefficients to a significant degree, which results in good visual quality. Evaluation was carried out using two criteria, namely peak signal-to-noise ratio (PSNR) and structural similarity (SSIM).
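
As a rough illustration of two ingredients named in this abstract, the sketch below implements generic soft thresholding of transform coefficients and the PSNR metric. The abstract does not specify the proposed threshold, so soft thresholding is only a common stand-in, and the 8-bit peak value of 255 is an assumption.

```python
import numpy as np

def soft_threshold(coeffs: np.ndarray, t: float) -> np.ndarray:
    """Shrink coefficients toward zero by t; values with |c| <= t become 0."""
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - t, 0.0)

def psnr(reference: np.ndarray, estimate: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB (peak=255 assumes 8-bit images)."""
    diff = reference.astype(float) - estimate.astype(float)
    mse = float(np.mean(diff ** 2))
    if mse == 0.0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

shrunk = soft_threshold(np.array([-3.0, 0.5, 2.0]), 1.0)
```

A higher PSNR indicates that the denoised image is closer to the reference; SSIM would be computed separately, typically with an image-processing library.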

Keywords: computed tomography (CT), noise reduction, curve-let, contour-let, signal to noise peak-peak ratio (PSNR), structure similarity (Ssim), absorbed dose to patient (ADP)

Procedia PDF Downloads 434
652 The Automatisation of Dictionary-Based Annotation in a Parallel Corpus of Old English

Authors: Ana Elvira Ojanguren Lopez, Javier Martin Arista

Abstract:

The aims of this paper are to present the automatisation procedure adopted in the implementation of a parallel corpus of Old English, as well as to assess the progress of automatisation with respect to tagging, annotation, and lemmatisation. The corpus consists of an aligned parallel text with word-for-word Old English-English comparison that provides the Old English segment with inflectional form tagging (gloss, lemma, category, and inflection) and lemma annotation (spelling, meaning, inflectional class, paradigm, word-formation and secondary sources). This parallel corpus is intended to fill a gap in the field of Old English, in which no parallel and/or lemmatised corpora are available, while the average amount of corpus annotation is low. With this background, this presentation has two main parts. The first part, which focuses on tagging and annotation, selects the layouts and fields of lexical databases that are relevant for these tasks. Most information used for the annotation of the corpus can be retrieved from the lexical and morphological database Nerthus and the database of secondary sources Freya. These are the sources of linguistic and metalinguistic information that will be used for the annotation of the lemmas of the corpus, including morphological and semantic aspects as well as the references to the secondary sources that deal with the lemmas in question. Although substantially adapted and re-interpreted, the lemmatised part of these databases draws on the standard dictionaries of Old English, including The Student's Dictionary of Anglo-Saxon, An Anglo-Saxon Dictionary, and A Concise Anglo-Saxon Dictionary. The second part of this paper deals with lemmatisation. It presents the lemmatiser Norna, which has been implemented on Filemaker software. It is based on a concordance and an index to the Dictionary of Old English Corpus, which comprises around three thousand texts and three million words.
In its present state, the lemmatiser Norna can automatically assign a lemma to around 80% of textual forms by searching the index and the concordance for prefixes, stems and inflectional endings. The conclusions of this presentation stress the limits of the automatisation of dictionary-based annotation in a parallel corpus. While the tagging and annotation are largely automatic even at the present stage, the automatisation of alignment is left for future research. Lemmatisation and morphological tagging are expected to be fully automatic in the near future, once the database of secondary sources Freya and the lemmatiser Norna have been completed.

Keywords: corpus linguistics, historical linguistics, old English, parallel corpus

Procedia PDF Downloads 207
651 Modeling False Statements in Texts

Authors: Francielle A. Vargas, Thiago A. S. Pardo

Abstract:

According to the standard philosophical definition, lying is saying something that you believe to be false with the intent to deceive. For deception detection, the FBI trains its agents in a technique named statement analysis, which attempts to detect deception based on parts of speech (i.e., linguistics style). This method is employed in interrogations, where the suspects are first asked to make a written statement. In this poster, we model false statements using linguistics style. In order to achieve this, we methodically analyze linguistic features in a corpus of fake news in the Portuguese language. The results show that they present substantial lexical, syntactic and semantic variations, as well as punctuation and emotion distinctions.

Keywords: deception detection, linguistics style, computational linguistics, natural language processing

Procedia PDF Downloads 212
650 The Impact of Undisturbed Flow Speed on the Correlation of Aerodynamic Coefficients as a Function of the Angle of Attack for the Gyroplane Body

Authors: Zbigniew Czyz, Krzysztof Skiba, Miroslaw Wendeker

Abstract:

This paper discusses the results of an aerodynamic investigation of the Tajfun gyroplane body designed by a Polish company, Aviation Artur Trendak. This gyroplane has been studied as a 1:8 scale model. Scaling objects for aerodynamic investigation is an inherent procedure in any kind of design. When scaling, the criteria of similarity need to be satisfied. The basic criteria of similarity are geometric, kinematic and dynamic. Although the results of aerodynamic research are often reduced to aerodynamic coefficients, one should pay attention to how the values of the coefficients behave if certain criteria are to be satisfied. To satisfy the dynamic criterion, for example, the Reynolds number should be considered. This is the ratio of inertial to viscous forces. Since its numerator is the flow speed multiplied by the specific dimension (with a constant kinematic viscosity coefficient), the flow speed in wind tunnel research should be increased as many times as the object is scaled down. The aerodynamic coefficients specified in this research depend on the real forces that act on an object, its specific dimension, medium speed and variations in its density. Rapid prototyping with a 3D printer was applied to create the research object. The research was performed with a T-1 low-speed wind tunnel (the diameter of its measurement volume is 1.5 m) and a six-element internal aerodynamic balance, WDP1, at the Institute of Aviation in Warsaw. The T-1 is a low-speed, continuous-operation wind tunnel with an open measurement space. The research covered a number of selected speeds of undisturbed flow, i.e. V = 20, 30 and 40 m/s, corresponding to the Reynolds numbers (as referred to 1 m) Re = 1.31·10⁶, 1.96·10⁶ and 2.62·10⁶, for angles of attack in the range -15° ≤ α ≤ 20°. Our research resulted in basic aerodynamic characteristics and in observations of the impact of undisturbed flow speed on the correlation of aerodynamic coefficients as a function of the angle of attack of the gyroplane body.
If the speed of undisturbed flow in the wind tunnel changes, the aerodynamic coefficients are significantly impacted. When the speed changes from 20 m/s to 30 m/s, the drag coefficient, Cx, changes by 2.4% up to 9.9%, whereas the lift coefficient, Cz, changes by -25.5% up to 15.7% if the angle of attack of 0° is excluded, or by -25.5% up to 236.9% if the angle of attack of 0° is included. Within the same speed range, the pitching moment coefficient, Cmy, changes by -21.1% up to 7.3% if the angles of attack of -15° and -10° are excluded, or by -142.8% up to 618.4% if the angles of attack of -15° and -10° are included. These discrepancies in the coefficients of aerodynamic forces definitely need to be considered while designing the aircraft. For example, if the load on certain aircraft surfaces is calculated, additional correction factors definitely need to be applied. This study allows us to estimate the discrepancies in the aerodynamic forces while scaling the aircraft. This work has been financed by the Polish Ministry of Science and Higher Education.
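
The dynamic similarity criterion in this abstract rests on the Reynolds number, Re = V·L/ν. The sketch below recomputes the quoted values for a 1 m reference length. The kinematic viscosity of 1.527·10⁻⁵ m²/s is an assumption back-computed from the abstract's own numbers, not a value the authors report, and the function names are illustrative.

```python
def reynolds_number(speed_m_s: float, length_m: float, nu_m2_s: float) -> float:
    """Re = V * L / nu (ratio of inertial to viscous forces)."""
    return speed_m_s * length_m / nu_m2_s

# Assumed kinematic viscosity of air, inferred from the quoted Re values.
NU_AIR = 1.527e-5

re_values = [reynolds_number(v, 1.0, NU_AIR) for v in (20.0, 30.0, 40.0)]

def tunnel_speed_for_equal_re(full_scale_speed: float, scale: float) -> float:
    """Speed needed on a model of relative size `scale` to match Re in the same fluid."""
    return full_scale_speed / scale

speed_1_to_8 = tunnel_speed_for_equal_re(10.0, 1 / 8)
```

For a 1:8 model in the same air, matching the Reynolds number requires eight times the full-scale speed, which is why tests at 20 to 40 m/s cannot fully satisfy the dynamic criterion and the coefficients vary with tunnel speed.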

Keywords: aerodynamics, criteria of similarity, gyroplane, research tunnel

Procedia PDF Downloads 387
649 Organotin (IV) Based Complexes as Promiscuous Antibacterials: Synthesis in vitro, in Silico Pharmacokinetic, and Docking Studies

Authors: Wajid Rehman, Sirajul Haq, Bakhtiar Muhammad, Syed Fahad Hassan, Amin Badshah, Muhammad Waseem, Fazal Rahim, Obaid-Ur-Rahman Abid, Farzana Latif Ansari, Umer Rashid

Abstract:

Five novel triorganotin (IV) compounds have been synthesized and characterized. The tin atom is penta-coordinated to assume trigonal-bipyramidal geometry. Using in silico derived parameters, the objective of our study is to design and synthesize promiscuous antibacterials potent enough to combat resistance. Among the various synthesized organotin (IV) complexes, compound 5 was found to be a potent antibacterial agent against various bacterial strains. Further lead optimization of drug-like properties was evaluated through in silico predictions. Data mining and computational analysis were utilized to derive the compound promiscuity phenomenon to avoid drug attrition rate in designing antibacterials. Xanthine oxidase and human glucose-6-phosphatase were found to be the only true positive off-target hits by the ChEMBL database and others utilizing the similarity ensemble approach. Propensity towards a-3 receptor, human macrophage migration factor and thiazolidinedione were found as false positive off targets with E-value 1/4> 10^-4 for compound 1, 3, and 4. Further, displaying positive drug-drug interaction of compound 1 as uricosuric was validated by all databases and docked protein targets with sequence similarity and compositional matrix alignment via BLAST software. Promiscuity of compound 5 was further confirmed by in silico binding to different antibacterial targets.

Keywords: antibacterial activity, drug promiscuity, ADMET prediction, metallo-pharmaceutical, antimicrobial resistance

Procedia PDF Downloads 500
648 Japanese English in Travel Brochures

Authors: Premvadee Na Nakornpanom

Abstract:

This study investigates the role and impact of English loan words on the Japanese language in travel brochures. It considers the issues arising from a potential shift in the role of English, from a tool to absorb the West’s advanced knowledge and technology in the modernization of Japan to a means of linking Japan with the rest of the world and enhancing the country’s international presence. Sociolinguistic contexts were used to analyze data collected from the brochures of the Nippon Travel agency "HIS" in Thailand, revealing that English plays its most important role as a lexical gap filler and special effect giver. The increasing mixture of English with Japanese affects how English is misused, the way the Japanese see the world, and the present generation’s communication gap.

Keywords: English, Japanese, loan words, travel brochure

Procedia PDF Downloads 233
647 Towards Law Data Labelling Using Topic Modelling

Authors: Daniel Pinheiro Da Silva Junior, Aline Paes, Daniel De Oliveira, Christiano Lacerda Ghuerren, Marcio Duran

Abstract:

The Courts of Accounts are institutions responsible for overseeing Public Administration expenses and pointing out irregularities in them. They have a high demand for processes to be analyzed, whose decisions must be grounded on severity laws. Despite the large number of existing processes, several cases report similar subjects. Thus, previous decisions on already analyzed processes can be a precedent for current processes that refer to similar topics. Identifying similar topics is an open yet essential task for identifying similarities between several processes. Since the actual number of topics is considerably large, it is tedious and error-prone to identify topics using a purely manual approach. This paper presents a tool based on Machine Learning and Natural Language Processing to assist in building a labeled dataset. The tool relies on Topic Modelling with Latent Dirichlet Allocation to find the topics underlying a document, followed by the Jensen-Shannon distance metric to generate a probability of similarity between document pairs. Furthermore, in a case study with a corpus of decisions of the Rio de Janeiro State Court of Accounts, it was noted that data pre-processing plays an essential role in modeling relevant topics. Also, the combination of topic modeling and a distance metric calculated over documents represented by the generated topics proved useful in helping to construct a labeled base of similar and non-similar document pairs.
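
The comparison step in this abstract can be illustrated with the Jensen-Shannon distance between two document-topic distributions. In this sketch the topic vectors are made-up inputs (in practice they would come from a fitted LDA model), and mapping the distance to a similarity score as 1 minus the distance is an assumption; the abstract does not say how its "probability of similarity" is derived.

```python
import numpy as np

def js_distance(p, q) -> float:
    """Jensen-Shannon distance (square root of the JS divergence, log base 2).

    With base-2 logarithms the result lies in [0, 1].
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0  # terms with a == 0 contribute nothing
        return float(np.sum(a[mask] * np.log2(a[mask] / b[mask])))

    jsd = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return float(np.sqrt(max(jsd, 0.0)))

def topic_similarity(p, q) -> float:
    """Assumed mapping from distance to a similarity score in [0, 1]."""
    return 1.0 - js_distance(p, q)

# Hypothetical topic mixtures for two documents over three topics.
doc_a = [0.7, 0.2, 0.1]
doc_b = [0.6, 0.3, 0.1]
score = topic_similarity(doc_a, doc_b)
```

Pairs scoring above a chosen cut-off could then be proposed to a human annotator as candidate "similar" pairs, which matches the labelling-assistance role the abstract describes.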

Keywords: courts of accounts, data labelling, document similarity, topic modeling

Procedia PDF Downloads 172
646 RussiAnglicized© Slang and Translation: A Clockwork Orange Tick-Tock

Authors: Mahnaz Movahedi

Abstract:

Slang argot plays a fundamental role in the special teenage sociolect of Burgess’ novel A Clockwork Orange, which offers a wide variety of instances to be analyzed. Consequently, translating these notions while keeping their effect is of great importance. Burgess named his RussiAnglicized© slang Nadsat, which stands for '-teen'; it is mostly derived from Russian and Cockney rhyming slang. The paper discusses the lexical origin and the Persian translation of his unusual slang words, which illustrate a teenage-gang argot. The translation shows creativity but also mistranslation that leads to the loss of the slang's meaning load and atmosphere in the target text.

Keywords: argot, mistranslation, slang, sociolect

Procedia PDF Downloads 249
645 Isolation and Identification of Probiotic Lactic Acid Bacteria with Cholesterol Lowering Potential and Their Use in Fermented Milk Product

Authors: Preeyarach Whisetkhan, Malai Taweechotipatr, Ulisa Pachekrepapol

Abstract:

An elevated level of blood cholesterol, or hypercholesterolemia, may lead to atherosclerosis and poses a major risk for cardiovascular diseases. Probiotics play a crucial role in human health, and probiotic bacteria that possess bile salt hydrolase (BSH) activity can be used to lower the cholesterol level of the host. The aim of this study was to investigate whether lactic acid bacteria (LAB) isolated from traditional Thai fermented foods exhibit bile salt hydrolase activity and to examine their use in fermented milk. A total of 28 isolates were tested for BSH activity by the plate method on MRS agar supplemented with 0.5% sodium salt of taurodeoxycholic acid and incubated at 37°C for 48 h under anaerobic conditions. The results showed that the FN1-1 and FN23-3 isolates possessed strong BSH activity. The FN1-1 and FN23-3 isolates were then characterized by phenotype, biochemical characteristics, and genotype (16S rRNA sequencing). The FN1-1 isolate showed 99.92% similarity to Lactobacillus pentosus DSM 20314(T), while the FN23-3 isolate showed 99.94% similarity to Enterococcus faecium CGMCC1.2136(T). Lactobacillus pentosus FN1-1 and Enterococcus faecium FN23-3 were tolerant of pH 3-4 and of 0.3 and 0.8% bile. The bacterial count and pH of milk fermented with Lactobacillus pentosus FN1-1 at 37°C and 43°C were investigated. The results revealed that Lactobacillus pentosus FN1-1 was able to grow in milk, which led to a decrease in pH. Fermentation at 37°C resulted in a faster growth rate than at 43°C. Lactobacillus pentosus FN1-1 is a candidate probiotic for use in fermented milk products to reduce the risk of high-cholesterol diseases.

Keywords: probiotics, lactic acid bacteria, bile salt hydrolase, cholesterol

Procedia PDF Downloads 146
644 Effects of Microbial Biofertilization on Nodulation, Nitrogen Fixation, and Yield of Lablab purpureus

Authors: Benselama Amel, Ounane S. Mohamed, Bekki Abdelkader

Abstract:

A collection of 20 isolates was obtained from fresh nodules of the legume plant Lablab purpureus. These isolates were authenticated by inoculating seedlings grown in jars containing sand. The results obtained after two months of culture revealed that all 20 isolates (100% of the isolates) are able to nodulate their host plants. The results were analyzed statistically by ANOVA using the Statistica software and showed that inoculation significantly improved all the growth parameters (the height of the plant, the dry weight of the aerial parts and roots, and the number of nodules). We evaluated the tolerance of all strains of the collection to major stress factors such as salinity, pH and extreme temperature. The osmotolerance reached a concentration of up to 1710 mM of NaCl. The strains were also able to grow over a wide range of pH, from 4.5 to 9.5, and of temperature, between 4°C and 40°C. We also tested the effect of acidity, aluminum and iron deficiency on the Lablab-rhizobia symbiosis. Lablab purpureus was not affected by the presence of high concentrations of aluminum. On the other hand, iron deficiency caused a net decrease in the dry biomass of the aerial part. The results for all the phenotypic characters were processed with the Minitab statistical software; the numerical analysis showed that these bacterial strains are divided into two distinct groups at a similarity level of 86%. SDS-PAGE was carried out to determine the total protein profile of the strains. The coefficients of similarity of polypeptide bands between the isolates and the reference strains (Bradyrhizobium, Mesorhizobium sp.) confirm that our strains belong to the rhizobia groups.

Keywords: SDS-PAGE, rhizobia, symbiosis, phenotypic characterization, Lablab purpureus

Procedia PDF Downloads 302
643 A Spatial Information Network Traffic Prediction Method Based on Hybrid Model

Authors: Jingling Li, Yi Zhang, Wei Liang, Tao Cui, Jun Li

Abstract:

Compared with terrestrial networks, the traffic of a spatial information network has both self-similarity and short correlation characteristics. Studying traffic prediction methods can improve the resource utilization of a spatial information network and provide an important basis for its traffic planning. In this paper, considering the accuracy and complexity of the algorithm, the spatial information network traffic is decomposed into an approximate component with long correlation and detail components with short correlation, and a time series hybrid prediction model based on wavelet decomposition is proposed to predict the spatial network traffic. Firstly, the original traffic data are decomposed into approximate components and detail components by using a wavelet decomposition algorithm. According to the tailing-off and cut-off characteristics of the autocorrelation and partial correlation of each component, the corresponding model (AR/MA/ARMA) of each detail component can be directly established, while the approximate component can be modeled by an ARIMA model after smoothing. Finally, the prediction results of the multiple models are fitted to obtain the prediction results of the original data. The method not only considers the self-similarity of a spatial information network, but also takes into account the short correlation caused by network burst information, which is verified by using the measured data of a certain backbone network released by the MAWI working group in 2018. Compared with the typical time series model, the predicted data of the hybrid model are closer to the real traffic data and have a smaller relative root mean square error, making the hybrid model more suitable for a spatial information network.
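
The decompose, model, recombine structure of this hybrid method can be sketched with a one-level Haar wavelet. This is only an illustration under stated assumptions: the abstract does not name the wavelet or the number of levels, the traffic values are invented, and the per-component AR/MA/ARMA and ARIMA fits are omitted (they would operate on `approx` and `detail` between the two steps shown).

```python
import numpy as np

def haar_decompose(x):
    """One-level Haar transform: returns (approximation, detail) components."""
    x = np.asarray(x, dtype=float)
    if len(x) % 2:
        raise ValueError("signal length must be even")
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)   # smooth, long-correlation part
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)   # bursty, short-correlation part
    return approx, detail

def haar_reconstruct(approx, detail):
    """Inverse of haar_decompose; also recombines per-component forecasts."""
    approx = np.asarray(approx, dtype=float)
    detail = np.asarray(detail, dtype=float)
    x = np.empty(2 * len(approx))
    x[0::2] = (approx + detail) / np.sqrt(2.0)
    x[1::2] = (approx - detail) / np.sqrt(2.0)
    return x

# Invented traffic samples, for illustration only.
traffic = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 8.0])
approx, detail = haar_decompose(traffic)
restored = haar_reconstruct(approx, detail)
```

Because the transform is exactly invertible, forecasts produced separately for each component can be passed through the same reconstruction to obtain a forecast of the original series.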

Keywords: spatial information network, traffic prediction, wavelet decomposition, time series model

Procedia PDF Downloads 141
642 On Early Verb Acquisition in Chinese-Speaking Children

Authors: Yating Mu

Abstract:

Young children acquire their native language with amazing rapidity. After noticing this interesting phenomenon, many linguists, as well as psychologists, have devoted themselves to exploring the best explanations. Thus research on first language acquisition emerged. Early lexical development is an important branch of children’s FLA (first language acquisition). The verb, the most significant class of the lexicon and the most grammatically complex syntactic category or word type, is not only the core of exploring the syntactic structures of language but also plays a key role in analyzing semantic features. Obviously, early verb development must have a great impact on children’s early lexical acquisition. Most scholars conclude that verbs, in general, are very difficult to learn because the problem in verb learning might be more about mapping a specific verb onto an action or event than about learning the underlying relational concepts that the verb or relational term encodes. However, previous research on early verb development mainly focuses on the argument about whether there is a noun bias or verb bias in children’s early productive vocabulary. There is little research on the general characteristics of children’s early verbs concerning both semantic and syntactic aspects, not to mention a general survey of Chinese-speaking children’s verb acquisition. Therefore, the author attempts to examine the general conditions and characteristics of Chinese-speaking children’s early productive verbs, based on data from a longitudinal study of three Chinese-speaking children. In order to present an overall picture of Chinese verb development, the present study focuses on both semantic and syntactic aspects. As for semantic analysis, a classification method is adopted first. The verb category is a sophisticated class in Mandarin, so it is necessary to divide it into smaller sub-types, which makes the research much easier.
By making a reasonable classification of eight verb classes on the basis of semantic features, the research aims to find out whether there exist any universal rules in Chinese-speaking children’s verb development. With regard to the syntactic aspect of the verb category, a debate between the nativist account and the usage-based approach has lasted for quite a long time. By analyzing the longitudinal Mandarin data, the author attempts to find out whether the usage-based theory can fully explain the characteristics of Chinese verb development. To sum up, this thesis attempts to apply the descriptive research method to investigate the acquisition and usage of Chinese-speaking children’s early verbs, with the aim of providing a new perspective for investigating the semantic and syntactic features of early verb acquisition.

Keywords: Chinese-speaking children, early verb acquisition, verb classes, verb grammatical structures

Procedia PDF Downloads 361
641 Effect of Thermal Radiation and Chemical Reaction on MHD Flow of Blood in Stretching Permeable Vessel

Authors: Binyam Teferi

Abstract:

In this paper, a theoretical analysis of blood flow in the presence of thermal radiation and chemical reaction under the influence of time dependent magnetic field intensity has been carried out. The unsteady nonlinear partial differential equations of blood flow consider a time dependent stretching velocity, the energy equation accounts for the time dependent temperature of the vessel wall, and the concentration equation includes a time dependent blood concentration. The governing nonlinear partial differential equations of motion, energy, and concentration are converted into ordinary differential equations using similarity transformations and solved numerically by applying ode45. MATLAB code is used to analyze theoretical facts. The effect of physical parameters, viz. permeability parameter, unsteadiness parameter, Prandtl number, Hartmann number, thermal radiation parameter, chemical reaction parameter, and Schmidt number, on flow variables, viz. velocity of blood flow in the vessel, temperature and concentration of blood, has been analyzed and discussed graphically. From the simulation study, the following important results are obtained: the velocity of blood flow increases with increases in both the permeability and unsteadiness parameters. The temperature of the blood in the vessel wall increases as the Prandtl number and Hartmann number increase. The concentration of the blood decreases as the time dependent chemical reaction parameter and Schmidt number increase.

Keywords: stretching velocity, similarity transformations, time dependent magnetic field intensity, thermal radiation, chemical reaction

Procedia PDF Downloads 89
640 Trinary Affinity—Mathematic Verification and Application (1): Construction of Formulas for the Composite and Prime Numbers

Authors: Liang Ming Zhong, Yu Zhong, Wen Zhong, Fei Fei Yin

Abstract:

Trinary affinity is a description of existence: every object exists as it is known and spoken of, in a system of 2 differences (denoted dif₁, dif₂) and 1 similarity (Sim), equivalently expressed as dif₁ / Sim / dif₂ and kn / 0 / tkn (kn = the known, tkn = the 'to be known', 0 = the zero point of knowing). They are mathematically verified and illustrated in this paper by the arrangement of all integers onto 3 columns, where each number exists as a difference in relation to another number as another difference, and the 2 difs as arbitrated by a third number as the Sim, resulting in a trinary affinity or trinity of 3 numbers, of which one is the known, the other the 'to be known', and the third the zero (0) from which both the kn and tkn are measured and specified. Consequently, any number is horizontally specified either as 3n, or as '3n – 1' or '3n + 1', and vertically as 'Cn + c', so that any number seems to occur at the intersection of its X and Y axes and represented by its X and Y coordinates, as any point on Earth’s surface by its latitude and longitude. Technically, i) primes are viewed and treated as progenitors, and composites as descending from them, forming families of composites, each capable of being measured and specified from its own zero called in this paper the realistic zero (denoted 0r, as contrasted to the mathematic zero, 0m), which corresponds to the constant c, and the nature of which separates the composite and prime numbers, and ii) any number is considered as having a magnitude as well as a position, so that a number is verified as a prime first by referring to its descriptive formula and then by making sure that no composite number can possibly occur on its position, by dividing it with factors provided by the composite number formulas. The paper consists of 3 parts: 1) a brief explanation of the trinary affinity of things, 2) the 8 formulas that represent ALL the primes, and 3) families of composite numbers, each represented by a formula.
A composite number family is described as 3n + f₁‧f₂. Since there are an infinitely large number of composite number families, to verify the primality of a great probable prime, we have to have it divided with several or many a f₁ from a range of composite number formulas, a procedure that is as laborious as it is the surest way to verifying a great number’s primality. (So, it is possible to substitute planned division for trial division.)
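
The three-column arrangement mentioned in this abstract (every integer is of the form 3n, 3n – 1 or 3n + 1) can be illustrated with a short sketch. The primality check below is ordinary trial division, shown only for comparison; it is not the paper's eight prime formulas or its "planned division", which the abstract does not spell out, and the function names are illustrative.

```python
def column(n: int) -> str:
    """Name the column an integer falls in: '3n', '3n + 1' or '3n - 1'."""
    return {0: "3n", 1: "3n + 1", 2: "3n - 1"}[n % 3]

def is_prime(n: int) -> bool:
    """Standard trial division by candidates of the form 6k - 1 and 6k + 1."""
    if n < 2:
        return False
    if n < 4:
        return True  # 2 and 3
    if n % 2 == 0 or n % 3 == 0:
        return False
    f = 5
    while f * f <= n:
        if n % f == 0 or n % (f + 2) == 0:
            return False
        f += 6
    return True

primes_to_30 = [n for n in range(2, 31) if is_prime(n)]
```

One well-known consequence of the column view is that every prime greater than 3 lies in the '3n – 1' or '3n + 1' column, since the '3n' column contains only multiples of 3.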

Keywords: trinary affinity, difference, similarity, realistic zero

Procedia PDF Downloads 207
639 Auditory Perception of Frequency-Modulated Sweeps and Reading Difficulties in Chinese

Authors: Hsiao-Lan Wang, Chun-Han Chiang, I-Chen Chen

Abstract:

In Chinese Mandarin, lexical tones play an important role to provide contrasts in word meaning. They are pitch patterns and can be quantified as the fundamental frequency (F0), expressed in Hertz (Hz). In this study, we aim to investigate the influence of frequency discrimination on Chinese children’s performance of reading abilities. Fifty participants from 3rd to 4th grades, including 24 children with reading difficulties and 26 age-matched children, were examined. A serial of cognitive, language, reading and psychoacoustic tests were administrated. Magnetoencephalography (MEG) was also employed to study children’s auditory sensitivity. In the present study, auditory frequency was measured through slide-up pitch, slide-down pitch and frequency-modulated tone. The results showed that children with Chinese reading difficulties were significantly poor at phonological awareness and auditory discrimination for the identification of frequency-modulated tone. Chinese children’s character reading performance was significantly related to lexical tone awareness and auditory perception of frequency-modulated tone. In our MEG measure, we compared the mismatch negativity (MMNm), from 100 to 200 ms, in two groups. There were no significant differences between groups during the perceptual discrimination of standard sounds, fast-up and fast-down frequencies. However, the data revealed significant cluster differences between groups in the slow-up and slow-down frequencies discrimination. In the slow-up stimulus, the cluster demonstrated an upward field map at 106-151 ms (p < .001) with a strong peak time at 127ms. The source analyses of two dipole model and localization resolution model (CLARA) from 100 to 200 ms both indicated a strong source from the left temporal area with 45.845% residual variance. 
Similar results were found in the slow-down stimulus with a larger upward current at 110-142 ms (p < 0.05) and a peak time at 117 ms in the left temporal area (47.857% residual variance). In short, we found a significant group difference in the MMNm while children processed frequency-modulated tones with slow temporal changes. The findings may imply that perception of sound frequency signals with slower temporal modulations is related to reading and language development in Chinese. Our study may also support the recent hypothesis that underlying non-verbal auditory temporal deficits account for the difficulties in literacy development seen in developmental dyslexia.

Keywords: Chinese Mandarin, frequency modulation sweeps, magnetoencephalography, mismatch negativity, reading difficulties

Procedia PDF Downloads 570
638 Relative Entropy Used to Determine the Divergence of Cells in Single Cell RNA Sequence Data Analysis

Authors: An Chengrui, Yin Zi, Wu Bingbing, Ma Yuanzhu, Jin Kaixiu, Chen Xiao, Ouyang Hongwei

Abstract:

Single cell RNA sequencing (scRNA-seq) is an effective tool for studying the transcriptomics of biological processes. Currently, similarity between cells is usually measured by Euclidean distance or its derivatives. However, the process of scRNA-seq is a multivariate Bernoulli event model; we therefore hypothesize that valuing the divergence between cells with relative entropy is more effective than using Euclidean distance. In this study, we compared the performances of Euclidean distance, Spearman correlation distance and relative entropy using scRNA-seq data of the early, medial and late stages of limb development generated in our lab. Relative entropy performed better than the other methods according to the cluster potential test. Furthermore, we developed KL-SNE, an algorithm that modifies t-SNE by changing its definition of divergence between cells from Euclidean distance to Kullback–Leibler divergence. Results showed that KL-SNE dissected cell heterogeneity more effectively than t-SNE, indicating that relative entropy performs better than Euclidean distance. Specifically, the chondrocytes expressing Comp were clustered together by KL-SNE but not by t-SNE. Surprisingly, cells in the early stage were surrounded by cells in the medial stage with KL-SNE, while medial-stage cells neighbored late-stage cells with t-SNE. These results parallel the heatmap, which showed that cells in the medial stage were more heterogeneous than cells in other stages. In addition, we found that the results of KL-SNE tend to follow a Gaussian distribution more closely than those of t-SNE, which was also verified by analysis of scRNA-seq data from another study on human embryo development. Therefore, KL-SNE is also an effective way to convert a non-Gaussian distribution to a Gaussian distribution and facilitate subsequent statistical processes. Thus, relative entropy is potentially a better way to determine the divergence of cells in scRNA-seq data analysis.
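
The contrast between the two dissimilarity measures can be illustrated with a minimal sketch. This is not the authors' KL-SNE implementation; it assumes each cell is a vector of non-negative gene counts that is normalized to a probability distribution, and the pseudocount, function names and example counts are hypothetical choices made only for illustration.

```python
import math

def to_distribution(counts, pseudocount=1e-9):
    """Normalize non-negative counts to a probability distribution.

    The pseudocount (an assumption of this sketch) avoids log(0).
    """
    shifted = [c + pseudocount for c in counts]
    total = sum(shifted)
    return [c / total for c in shifted]

def relative_entropy(p_counts, q_counts):
    """Kullback-Leibler divergence D(P || Q), in nats, between two cells."""
    p = to_distribution(p_counts)
    q = to_distribution(q_counts)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def euclidean_distance(a, b):
    """Plain Euclidean distance between two raw count vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Made-up counts for four genes in three cells.
cell_a = [10, 0, 5, 1]
cell_b = [20, 0, 10, 2]  # same expression profile as cell_a, twice the depth
cell_c = [1, 10, 0, 5]   # a different expression profile
```

In this toy data, `cell_a` and `cell_b` are far apart in Euclidean terms because of sequencing depth alone, while their relative entropy is close to zero. Note that the Kullback–Leibler divergence is asymmetric, so `relative_entropy(a, b)` generally differs from `relative_entropy(b, a)`.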

Keywords: Single cell RNA sequence, Similarity measurement, Relative Entropy, KL-SNE, t-SNE

Procedia PDF Downloads 337
637 Designing a Tool for Software Maintenance

Authors: Amir Ngah, Masita Abdul Jalil, Zailani Abdullah

Abstract:

The aim of software maintenance is to keep the software system in step with advances in software and hardware technology. One of the early tasks in software maintenance is to extract information at a higher level of abstraction. In this paper, we present the process of designing an information extraction tool for software maintenance. The tool can extract basic information from an old program, such as variables, base classes, derived classes, objects of classes, and functions. The tool has two main parts: the lexical analyzer module, which reads the input file character by character, and the searching module, with which the user can retrieve the basic information from the existing program. We implemented this tool for a patterned sub-C++ language as an input file.
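
The two-module design described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' tool: the token categories, the function names `tokenize` and `find_classes`, and the tiny C++-like sample are all hypothetical.

```python
def tokenize(source):
    """Lexical analyzer: scan the input one character at a time."""
    tokens = []
    i = 0
    while i < len(source):
        ch = source[i]
        if ch.isspace():
            i += 1
        elif ch.isalpha() or ch == "_":
            start = i
            while i < len(source) and (source[i].isalnum() or source[i] == "_"):
                i += 1
            tokens.append(("ident", source[start:i]))
        elif ch.isdigit():
            start = i
            while i < len(source) and source[i].isdigit():
                i += 1
            tokens.append(("number", source[start:i]))
        else:
            # Every other character is treated as a one-character symbol.
            tokens.append(("symbol", ch))
            i += 1
    return tokens

def find_classes(tokens):
    """Searching module: list (class_name, base_name_or_None) pairs."""
    access = ("public", "protected", "private")
    found = []
    for idx, (kind, text) in enumerate(tokens):
        if kind == "ident" and text == "class" and idx + 1 < len(tokens):
            name = tokens[idx + 1][1]
            base = None
            rest = tokens[idx + 2: idx + 5]
            # Recognize the pattern ': <access> <Base>' after the class name.
            if len(rest) == 3 and rest[0] == ("symbol", ":") and rest[1][1] in access:
                base = rest[2][1]
            found.append((name, base))
    return found

sample = "class Shape { int id; }; class Circle : public Shape { double r; };"
classes = find_classes(tokenize(sample))
```

Keeping the scanner separate from the search step means further queries (variables, functions, objects) can reuse the same token stream.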

Keywords: extraction tool, software maintenance, reverse engineering, C++

Procedia PDF Downloads 489
636 Data Mining Spatial: Unsupervised Classification of Geographic Data

Authors: Chahrazed Zouaoui

Abstract:

In recent years, the volume of geospatial information has been increasing due to the evolution of information and communication technologies. This information is often presented by geographic information systems (GIS) and stored in spatial databases (BDS). Classical data mining has shown a weakness in extracting knowledge from these enormous amounts of data because of the particularity of spatial entities, which are characterized by their interdependence (the first law of geography). This gave rise to spatial data mining. Spatial data mining is a process of analyzing geographic data that allows the extraction of knowledge and spatial relationships from geospatial data. Among the methods of this process, we distinguish the monothematic and the thematic. Geo-clustering is one of the main tasks of spatial data mining and belongs to the monothematic methods. It groups similar geospatial entities in the same class and assigns dissimilar ones to different classes; in other words, it maximizes intra-class similarity and minimizes inter-class similarity while taking account of the particularity of geospatial data. Two approaches to geo-clustering exist: the dynamic processing of data, which involves applying algorithms designed for the direct treatment of spatial data, and the approach based on spatial data pre-processing, which consists of applying classic clustering algorithms to pre-processed data (by integration of spatial relationships). 
This approach (based on pre-processing) is quite complex in many cases, so the search for approximate solutions involves the use of approximation algorithms. Among these, we are interested in dedicated approaches (partitioning and density-based clustering methods) and the bees approach (a biomimetic approach). Our study proposes a design for this problem that uses different algorithms for automatically detecting geospatial neighborhoods in order to implement geo-clustering by pre-processing, and applies the bees algorithm to this problem for the first time in the geospatial field.

Keywords: mining, GIS, geo-clustering, neighborhood

Procedia PDF Downloads 370
635 Investigating the Influences of Long-Term, as Compared to Short-Term, Phonological Memory on the Word Recognition Abilities of Arabic Readers vs. Arabic Native Speakers: A Word-Recognition Study

Authors: Insiya Bhalloo

Abstract:

It is quite common in the Muslim faith for non-Arabic speakers to be able to convert written Arabic, especially Quranic Arabic, into a phonological code without significant semantic or syntactic knowledge. This is due to prior experience learning to read the Quran (a religious text written in Classical Arabic) from a very young age, such as via enrolment in Quranic Arabic classes. Compared to native speakers of Arabic, these Arabic readers do not have a comprehensive morpho-syntactic knowledge of the Arabic language, nor can they understand or engage in Arabic conversation. The study seeks to investigate whether mere phonological experience (as indicated by the Arabic readers’ experience with Arabic phonology and the sound system) is sufficient to cause phonological interference during word recognition of previously heard words, despite the participants’ non-native status. Both native speakers of Arabic and non-native speakers of Arabic, i.e., those individuals who learned to read the Quran from a young age, will be recruited. Each experimental session will include two phases: an exposure phase and a test phase. During the exposure phase, participants will be presented with Arabic words (n = 40) on a computer screen. Half of these words will be common words found in the Quran, while the other half will be words commonly found in Modern Standard Arabic (MSA) but either non-existent or prevalent at a significantly lower frequency within the Quran. During the test phase, participants will then be presented with both familiar (n = 20; i.e., those words presented during the exposure phase) and novel Arabic words (n = 20; i.e., words not presented during the exposure phase). Half of these presented words will be common Quranic Arabic words and the other half will be common MSA words but not Quranic words. 
Moreover, half of the Quranic Arabic and MSA words presented will be nouns, while the other half will be verbs, thereby eliminating word-processing issues affected by lexical category. Participants will then determine if they had seen that word during the exposure phase. This study seeks to investigate whether long-term phonological memory, such as via childhood exposure to Quranic Arabic orthography, has a differential effect on the word-recognition capacities of native Arabic speakers and Arabic readers; we seek to compare the effects of long-term phonological memory with those of short-term phonological exposure (as indicated by the presentation of familiar words from the exposure phase). The researcher’s hypothesis is that, despite the lack of lexical knowledge, early experience with converting written Quranic Arabic text into a phonological code will help participants recall the familiar Quranic words that appeared during the exposure phase more accurately than those that were not presented during the exposure phase. Moreover, it is anticipated that the non-native Arabic readers will also report more false alarms to the unfamiliar Quranic words, due to early childhood phonological exposure to Quranic Arabic script, thereby causing false phonological facilitatory effects.

Keywords: modern standard arabic, phonological facilitation, phonological memory, Quranic arabic, word recognition

Procedia PDF Downloads 350
634 A Hybrid Watermarking Scheme Using Discrete and Discrete Stationary Wavelet Transformation For Color Images

Authors: Bülent Kantar, Numan Ünaldı

Abstract:

This paper presents a new method for robust and invisible digital watermarking of color images. Color images are used as watermarks. The frequency domain is used for digital watermarking. The discrete wavelet transform and the discrete stationary wavelet transform are used for the frequency-domain transformation. Low, medium and high frequency coefficients are obtained by applying the two-level discrete wavelet transform to the original image. Low frequency coefficients are obtained by applying the one-level discrete stationary wavelet transform separately to each set of frequency coefficients of the two-level discrete wavelet transform of the original image. A watermark is added to every low frequency coefficient obtained from the one-level discrete stationary wavelet transform. Watermarks are added to all frequency coefficients of the two-level discrete wavelet transform. In total, four watermarks are added to the original image. To recover the watermark, the two-level discrete wavelet transform and the one-level discrete stationary wavelet transform are applied to the original and watermarked images. The watermark is obtained from the difference between the discrete stationary wavelet transforms of the low frequency coefficients. A total of four watermarks are obtained from all frequencies of the two-level discrete wavelet transform. The obtained watermarks are compared with the real watermark, and a similarity result is obtained. The watermark with the highest similarity value is selected. The proposed watermarking method is tested against geometric and image processing attacks. The results show that the proposed watermarking method is robust and invisible. All frequency features of the two-level discrete wavelet transform watermarking are combined to recover the watermark from the watermarked image. Watermarks have been added to the image by converting the binary image. 
These operations provide better results in recovering the watermark from a watermarked image subjected to geometric and image processing attacks.
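
The embed, extract and compare cycle described above can be sketched in a heavily simplified form. The sketch below is not the authors' scheme: it assumes a one-dimensional signal of even length, a single-level Haar transform in place of the two-level discrete wavelet transform and the stationary wavelet transform, additive embedding with a hypothetical strength `alpha`, a watermark exactly half the length of the host, and normalized correlation as the similarity measure.

```python
import math

def haar_dwt(signal):
    """One-level Haar transform: (approximation, detail) coefficients."""
    s = math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt."""
    s = math.sqrt(2)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out

def embed(host, watermark, alpha=0.1):
    """Add the scaled watermark to the low-frequency coefficients."""
    approx, detail = haar_dwt(host)
    marked = [a + alpha * w for a, w in zip(approx, watermark)]
    return haar_idwt(marked, detail)

def extract(host, watermarked, alpha=0.1):
    """Recover the watermark from the difference of low-frequency coefficients."""
    original, _ = haar_dwt(host)
    received, _ = haar_dwt(watermarked)
    return [(y - x) / alpha for x, y in zip(original, received)]

def similarity(w1, w2):
    """Normalized correlation between two watermark vectors."""
    dot = sum(x * y for x, y in zip(w1, w2))
    norm = math.sqrt(sum(x * x for x in w1)) * math.sqrt(sum(y * y for y in w2))
    return dot / norm

host = [52.0, 55.0, 61.0, 59.0, 79.0, 61.0, 76.0, 61.0]  # made-up pixel row
mark = [1.0, -1.0, 1.0, 1.0]                             # made-up watermark
recovered = extract(host, embed(host, mark))
```

Without an attack, the recovered watermark matches the embedded one up to rounding; after an attack, `similarity` would be used to choose the best of several recovered copies.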

Keywords: watermarking, DWT, DSWT, copy right protection, RGB

Procedia PDF Downloads 534
633 Different Views and Evaluations of IT Artifacts

Authors: Sameh Al-Natour, Izak Benbasat

Abstract:

The introduction of a multitude of new and interactive e-commerce information technology (IT) artifacts has impacted adoption research. Rather than solely functioning as productivity tools, new IT artifacts assume the roles of interaction mediators and social actors. This paper describes the varying roles assumed by IT artifacts, and proposes and distinguishes between four distinct foci of how the artifacts are evaluated. It further proposes a theoretical model that maps the different views of IT artifacts to four distinct types of evaluations.

Keywords: IT adoption, IT artifacts, similarity, social actor

Procedia PDF Downloads 386
632 Estimating Estimators: An Empirical Comparison of Non-Invasive Analysis Methods

Authors: Yan Torres, Fernanda Simoes, Francisco Petrucci-Fonseca, Freddie-Jeanne Richard

Abstract:

Non-invasive samples are an alternative to collecting genetic samples directly. They are collected without manipulating the animal (e.g., scats, feathers and hairs). Nevertheless, the use of non-invasive samples has some limitations. The main issue is degraded DNA, leading to poorer extraction efficiency and genotyping. Those errors delayed the widespread use of non-invasive genetic information for some years. Genotyping errors can be limited by using analysis methods that accommodate the errors and singularities of non-invasive samples. Genotype matching and population estimation algorithms are important analysis tools that have been adapted to deal with those errors. Despite this recent development of analysis methods, there is still a lack of empirical comparison of their performance. A comparison of methods on datasets that differ in size and structure can be useful for future studies, since non-invasive samples are a powerful tool for obtaining information, especially for endangered and rare populations. To compare the analysis methods, four different datasets obtained from the Dryad digital repository were used. Three different matching algorithms (Cervus, Colony and Error Tolerant Likelihood Matching - ETLM) were used for matching genotypes and two different ones for population estimation (Capwire and BayesN). The three matching algorithms showed different patterns of results. ETLM produced a smaller number of unique individuals and recaptures. A similarity in the matched genotypes between Colony and Cervus was observed. That is not a surprise, given the similarity between those methods in their pairwise likelihood and clustering algorithms. The matching of ETLM showed almost no similarity with the genotypes that were matched with the other methods. 
The different clustering algorithm and error model of ETLM seem to lead to a more stringent selection, although the processing time and interface friendliness of ETLM were the worst among the compared methods. The population estimators performed differently across the datasets. There was a consensus between the different estimators for only one dataset. BayesN showed both higher and lower estimates when compared with Capwire. Unlike Capwire, BayesN does not consider the total number of recaptures, only the recapture events. This makes the estimator sensitive to data heterogeneity, which here means different capture rates between individuals. In those examples, the tolerance for homogeneity seems to be crucial for BayesN to work properly. Both methods are user-friendly and have reasonable processing times. An extended analysis with simulated genotype data can clarify the sensitivity of the algorithms. The present comparison of the matching methods indicates that Colony seems to be more appropriate for general use considering a time/interface/robustness balance. The heterogeneity of the recaptures strongly affected the BayesN estimations, leading to over- and underestimation of population numbers. Capwire is therefore advisable for general use since it performs better in a wide range of situations.

Keywords: algorithms, genetics, matching, population

Procedia PDF Downloads 139
631 Code-Switching as a Bilingual Phenomenon among Students in Prishtina International Schools

Authors: Festa Shabani

Abstract:

This paper aims to investigate bilingual speech in the International Schools of Prishtina. More particularly, it seeks to analyze bilingual phenomena in naturally occurring conversations within the school environment among adolescent students who are highly exposed to English, the language of instruction at school. Adolescence was deliberately chosen since it is regarded as an age when peer influence on language choice is the greatest. Driven by daily unsystematic observation and prior research, the hypothesis is that Albanian continues to be the dominant language among Prishtina international schools’ students, with many code-switched items from English. Furthermore, they will also use lexical borrowings (words already adapted in the receiving language) from the language they have been in contact with, often for lack of existing equivalents in Albanian or for other reasons. This is because the language of instruction at school is English, and any topic related to the language they have been exposed to will trigger them to use English. Therefore, this needs special attention in an attempt to identify patterns in their speech; accordingly, linguistic and socio-pragmatic factors will be considered when analyzing the motivations behind their language choice. The methodology for collecting data includes systematic participant observation and tape-recording. While observing them in their natural conversations, the fieldworker also took notes, which helped transcribe details better. The paper starts by raising the question of whether code-switching is occurring among Prishtina International Schools’ students highly exposed to English. The data gathered from students in informal settings suggest that there are well-founded grounds for an affirmative answer. The participants in this study are observed to be code-switching, although to differing degrees. 
However, a generalization cannot be made on the basis of the findings, except insofar as it appears that English has, in turn, become a language to which they turn when identifying with the group while discussing particular school topics. In particular, participants seemed to use intra-sentential code-switching (CS) when they found an English expression easier than an Albanian one, when repeating or emphasizing a point, and when urged to talk about educational issues, with English being their language of instruction; they used inter-sentential code-switching particularly when quoting others. Concerning the grammatical aspect of code-switching, intra-sentential CS is used more than inter-sentential CS. Regarding gender, the results show no significant differences in quantity between male and female participants. However, a slight tendency for men to code-switch intra-sententially more than women was manifested. Similarly, a slight difference emerges in inter-sentential switching, which contributes 21% of the total number of switches for women but 11% for men.

Keywords: Albanian, code-switching contact linguistics, bilingual phenomena, lexical borrowing, English

Procedia PDF Downloads 123
630 Building an Arithmetic Model to Assess Visual Consistency in Townscape

Authors: Dheyaa Hussein, Peter Armstrong

Abstract:

The phenomenon of visual disorder is prominent in contemporary townscapes. This paper provides a theoretical framework for the assessment of visual consistency in townscape in order to achieve more favourable outcomes for users. In this paper, visual consistency refers to the amount of similarity between adjacent components of townscape. The paper investigates parameters which relate to visual consistency in townscape, explores the relationships between them and highlights their significance. The paper uses arithmetic methods from outside the domain of urban design to establish an objective approach to assessment which considers subjective indicators, including users’ preferences. These methods involve the standard deviation, colour distance and the distance between points. The paper identifies urban space as a key representative of the visual parameters of townscape. It focuses on its two components, geometry and colour, in the evaluation of the visual consistency of townscape. Accordingly, this article proposes four measurements. The first quantifies the number of vertices, which are points in three-dimensional space that are connected by lines to represent the appearance of elements. The second evaluates the visual surroundings of urban space by assessing the location of their vertices. The last two measurements calculate the visual similarity in both vertices and colour in townscape by calculating their variation using methods including standard deviation and colour difference. The proposed quantitative assessment is based on users’ preferences towards these measurements. The paper offers a theoretical basis for a practical tool which can alter the current understanding of architectural form and its application in urban space. This tool is currently under development. 
The proposed method underpins expert subjective assessment and permits the establishment of a unified framework which adds to creativity by the achievement of a higher level of consistency and satisfaction among the citizens of evolving townscapes.
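
The arithmetic behind the last two measurements can be illustrated with a short sketch. It is not the authors' tool: the abstract does not specify the colour space or formula, so Euclidean distance in RGB is assumed here, and the function names and sample values are hypothetical.

```python
import math
import statistics

def vertex_count_variation(vertex_counts):
    """Population standard deviation of vertex counts of adjacent elements."""
    return statistics.pstdev(vertex_counts)

def colour_distance(rgb_a, rgb_b):
    """Euclidean distance between two colours in RGB space (an assumption)."""
    return math.dist(rgb_a, rgb_b)

def mean_adjacent_colour_distance(colours):
    """Average colour distance between each element and its neighbour."""
    pairs = zip(colours, colours[1:])
    return statistics.mean(colour_distance(a, b) for a, b in pairs)

# Made-up vertex counts and facade colours for two hypothetical streets.
uniform_street = [12, 12, 12, 12]
varied_street = [4, 20, 8, 40]
uniform_colours = [(200, 180, 160), (200, 180, 160), (200, 180, 160)]
varied_colours = [(200, 180, 160), (20, 40, 90), (250, 250, 250)]
```

Lower values of both measures indicate greater similarity between adjacent elements, which is how the abstract defines visual consistency.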

Keywords: townscape, urban design, visual assessment, visual consistency

Procedia PDF Downloads 308
629 Identification of Analogues to EGCG for the Inhibition of HPV E7: A Fundamental Insights through Structural Dynamics Study

Authors: Murali Aarthy, Sanjeev Kumar Singh

Abstract:

High-risk human papillomaviruses are strongly associated with carcinoma of the cervix and other genital tumors. Cervical cancer develops through a multistep process in which increasingly severe premalignant dysplastic lesions, called cervical intraepithelial neoplasia, progress to invasive cancer. The oncoprotein E7 of human papillomavirus, expressed in the lower epithelial layers, drives the cells into S-phase, creating an environment conducive to viral genome replication and cell proliferation. The replication of the virus occurs in the terminally differentiating epithelium and requires the activation of cellular DNA replication proteins. To date, no suitable drug molecule is available to treat HPV infection, whereas the identification of potential drug targets and the development of novel anti-HPV chemotherapies with unique modes of action are expected. Hence, our present study aimed to identify potential inhibitors analogous to EGCG, a green tea molecule which is considered safe for use in mammalian systems. A 3D similarity search on the natural small molecule library from a natural product database using EGCG identified 11 potential hits based on their similarity score. Structure-based docking strategies were applied to the potential hits, and the key interacting residues of the protein with the compounds were identified through simulation studies and binding free energy calculations. The conformational changes between the apoprotein and the complex were analyzed with the simulation, and the results demonstrated that the dynamical and structural effects observed in the protein were induced by the compounds and indicated the dominance to the oncoprotein. Overall, our study provides the basis for structural insights into the identified potential hits and EGCG; hence, the analogous compounds identified can be potent inhibitors against the HPV 16 E7 oncoprotein.

Keywords: EGCG, oncoprotein, molecular dynamics simulation, analogues

Procedia PDF Downloads 123
628 Brown-Spot Needle Blight: An Emerging Threat Causing Loblolly Pine Needle Defoliation in Alabama, USA

Authors: Debit Datta, Jeffrey J. Coleman, Scott A. Enebak, Lori G. Eckhardt

Abstract:

Loblolly pine (Pinus taeda) is a leading productive timber species in the southeastern USA. Over the past three years, an emerging threat has appeared in the form of successive needle defoliation followed by stunted growth and tree mortality in loblolly pine plantations. Given its economic significance, it has now become a rising concern among landowners, forest managers, and forest health state cooperators. However, the symptoms of the disease were somewhat confused with root disease(s) and recurrently attributed to invasive Phytophthora species due to the similarity of disease nature and devastation. Therefore, the study investigated the potential causal agent of this disease and characterized the fungi associated with loblolly pine needle defoliation in the southeastern USA. In addition, 70 trees were selected at seven long-term monitoring plots at Chatom, Alabama, to monitor and record the annual disease incidence and severity. Based on colony morphology and ITS-rDNA sequence data, a total of 28 species of fungi representing 17 families have been recovered from diseased loblolly pine needles. The native brown-spot pathogen, Lecanosticta acicola, was the species most frequently recovered from unhealthy loblolly pine needles in combination with some other common needle cast and rust pathogen(s). Identification was confirmed using morphological similarity and amplification of the translation elongation factor 1-alpha gene region of interest. Tagged trees were consistently found chlorotic and defoliated from 2019 to 2020. The current emergence of the brown-spot pathogen causing loblolly pine mortality necessitates the investigation of the role of changing climatic conditions, which might be associated with increased pathogen pressure on loblolly pines in the southeastern USA.

Keywords: brown-spot needle blight, loblolly pine, needle defoliation, plantation forestry

Procedia PDF Downloads 149
627 Metacognitive Processing in Early Readers: The Role of Metacognition in Monitoring Linguistic and Non-Linguistic Performance and Regulating Students' Learning

Authors: Ioanna Taouki, Marie Lallier, David Soto

Abstract:

Metacognition refers to the capacity to reflect upon our own cognitive processes. Although there is an ongoing discussion in the literature on the role of metacognition in learning and academic achievement, little is known about its neurodevelopmental trajectories in early childhood, when children begin to receive formal education in reading. Here, we evaluate the metacognitive ability, estimated under a recently developed Signal Detection Theory model, of a cohort of children aged between 6 and 7 (N=60), who performed three two-alternative-forced-choice tasks (two linguistic: lexical decision task, visual attention span task, and one non-linguistic: emotion recognition task) including trial-by-trial confidence judgements. Our study has three aims. First, we investigated how metacognitive ability (i.e., how confidence ratings track accuracy in the task) relates to performance in general standardized tasks related to students' reading and general cognitive abilities using Spearman's and Bayesian correlation analysis. Second, we assessed whether or not young children recruit common mechanisms supporting metacognition across the different task domains or whether there is evidence for domain-specific metacognition at this early stage of development. This was done by examining correlations in metacognitive measures across different task domains and evaluating cross-task covariance by applying a hierarchical Bayesian model. Third, using robust linear regression and Bayesian regression models, we assessed whether metacognitive ability in this early stage is related to the longitudinal learning of children in a linguistic and a non-linguistic task. Notably, we did not observe any association between students’ reading skills and metacognitive processing in this early stage of reading acquisition. 
Some evidence consistent with domain-general metacognition was found, with significant positive correlations in metacognitive efficiency between the lexical and emotion recognition tasks and substantial covariance indicated by the Bayesian model. However, no reliable correlations were found between metacognitive performance in the visual attention span task and the remaining tasks. Remarkably, metacognitive ability significantly predicted children's learning in linguistic and non-linguistic domains a year later. These results suggest that metacognitive skill may be dissociated to some extent from general (i.e., language and attention) abilities and further stress the importance of creating educational programs that foster students’ metacognitive ability as a tool for long-term learning. More research is crucial to understand whether these programs can enhance metacognitive ability as a transferable skill across distinct domains or whether unique domains should be targeted separately.

Keywords: confidence ratings, development, metacognitive efficiency, reading acquisition

Procedia PDF Downloads 147
626 Artificial Neural Network Approach for GIS-Based Soil Macro-Nutrients Mapping

Authors: Shahrzad Zolfagharnassab, Abdul Rashid Mohamed Shariff, Siti Khairunniza Bejo

Abstract:

Conventional methods for nutrient soil mapping are based on laboratory tests of samples that are obtained from surveys. The time and cost involved in gathering and analyzing soil samples are the reasons that researchers use Predictive Soil Mapping (PSM). PSM can be defined as the development of a numerical or statistical model of the relationship among environmental variables and soil properties, which is then applied to a geographic database to create a predictive map. Kriging is a group of geostatistical techniques to spatially interpolate point values at an unobserved location from observations of values at nearby locations. The main problem with using kriging as an interpolator is that it is excessively data-dependent and requires a large number of closely spaced data points. Hence, there is a need to minimize the number of data points without sacrificing the accuracy of the results. In this paper, an Artificial Neural Networks (ANN) scheme was used to predict macronutrient values at un-sampled points. ANN has become a popular tool for prediction as it eliminates certain difficulties in soil property prediction, such as non-linear relationships and non-normality. Back-propagation multilayer feed-forward network structures were used to predict nitrogen, phosphorous and potassium values in the soil of the study area. A limited number of samples were used in the training, validation and testing phases of ANN (pattern reconstruction structures) to classify soil properties and the trained network was used for prediction. The soil analysis results of samples collected from the soil survey of block C of Sawah Sempadan, Tanjung Karang rice irrigation project at Selangor of Malaysia were used. Soil maps were produced by the Kriging method using 236 samples (or values) that were a combination of actual values (obtained from real samples) and virtual values (neural network predicted values). 
For each macronutrient element, three types of maps were generated with 118 actual and 118 virtual values, 59 actual and 177 virtual values, and 30 actual and 206 virtual values, respectively. To evaluate the performance of the proposed method, for each macronutrient element, a base map using 236 actual samples and test maps using 118, 59 and 30 actual samples, respectively, were produced by the Kriging method. A set of parameters was defined to measure the similarity of the maps that were generated with the proposed method, termed the sample reduction method. The results show that the maps that were generated through the sample reduction method were more accurate than the corresponding test maps produced from a smaller number of real samples. For example, nitrogen maps that were produced from 118, 59 and 30 real samples have 78%, 62% and 41% similarity, respectively, with the base map (236 samples), and the sample reduction method increased similarity to 87%, 77% and 71%, respectively. Hence, this method can reduce the number of real samples and substitute ANN predictive samples to achieve the specified level of accuracy.

Keywords: artificial neural network, kriging, macro nutrient, pattern recognition, precision farming, soil mapping

Procedia PDF Downloads 68
625 Implementation of Algorithm K-Means for Grouping District/City in Central Java Based on Macro Economic Indicators

Authors: Nur Aziza Luxfiati

Abstract:

Clustering partitions a data set into subsets or groups such that elements share properties with a high level of similarity within a group and a low level of similarity between groups. The K-Means algorithm is one of the clustering algorithms most widely used as a grouping tool in scientific and industrial applications because its basic idea is very simple. This research applies clustering with the k-means algorithm to the problem of national development imbalances between regions in Central Java Province, based on macroeconomic indicators. The data sample is secondary data obtained from the Central Java Provincial Statistics Agency: macroeconomic indicator data that is part of the published 2019 National Socio-Economic Survey (Susenas) data. Outliers are detected using z-score normalization, and the number of clusters (k) is determined using the elbow method. After clustering, the result is validated using the Between-Class Variation (BCV) and Within-Class Variation (WCV) methods. Outlier detection using z-score normalization found no outliers. In addition, the clustering test yielded a ratio value that was not high, namely 0.011%. There are two district/city clusters in Central Java Province with economic similarities based on the variables used: the first cluster, with a high economic level, consists of 13 districts/cities, and the second cluster, with a low economic level, consists of 22 districts/cities. Within the second, low-economy cluster, the authors grouped districts/cities by similarity in macroeconomic indicators: 20 districts by Gross Regional Domestic Product, 19 districts by Poverty Depth Index, 5 districts by Human Development, and 10 districts by Open Unemployment Rate.
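
The pipeline described above (z-score normalization, k-means, the elbow method, and BCV/WCV validation) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the author's implementation: the farthest-point seeding, the definitions of WCV (sum of squared distances to the assigned centroid) and BCV (sum of squared distances between centroids), and the toy data are all assumptions chosen for the example.

```python
import math


def zscore_columns(rows):
    """Standardize each column to mean 0 and population std 1."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / n)
            for c, m in zip(cols, means)]
    return [[(v - m) / s if s > 0 else 0.0
             for v, m, s in zip(row, means, stds)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_seeds(points, k):
    """Deterministic seeding (an assumption): start from the first point,
    then repeatedly add the point farthest from the seeds chosen so far."""
    seeds = [list(points[0])]
    while len(seeds) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, s) for s in seeds))
        seeds.append(list(nxt))
    return seeds


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = farthest_point_seeds(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids


def wcv(points, labels, centroids):
    """Within-class variation: squared distances to assigned centroids."""
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))


def bcv(centroids):
    """Between-class variation: squared distances between centroid pairs."""
    return sum(sq_dist(centroids[i], centroids[j])
               for i in range(len(centroids))
               for j in range(i + 1, len(centroids)))


# Toy data: four hypothetical regions described by two indicators.
data = [[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [5.2, 4.8]]
scaled = zscore_columns(data)

# Elbow method: inspect how WCV falls as k grows and pick the bend.
for k in range(1, 4):
    labels, centroids = kmeans(scaled, k)
    print(k, wcv(scaled, labels, centroids))
```

In this sketch a larger BCV relative to WCV indicates better-separated clusters; how the paper's reported ratio is defined is not stated in the abstract.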

Keywords: clustering, K-Means algorithm, macroeconomic indicators, inequality, national development

Procedia PDF Downloads 154