Search results for: data association.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 7617

7497 Research of Data Cleaning Methods Based on Dependency Rules

Authors: Yang Bao, Shi Wei Deng, Wang Qun Lin

Abstract:

This paper introduces the concept and principles of data cleaning, analyzes the types and causes of dirty data, and outlines the key steps of a typical cleaning process. It puts forward a scalable and versatile data cleaning framework. For data with attribute dependency relations, it formally defines several violation data discovery algorithms, which can find inconsistent data in all target columns that depend on condition attributes, whether the data is structured (SQL) or unstructured (NoSQL). It then gives 6 data cleaning methods based on these algorithms.

Keywords: Data cleaning, dependency rules, violation data discovery, data repair.
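
A dependency-rule check of the kind described above can be sketched in a few lines. The sketch is illustrative only and is not the authors' algorithm: it assumes a single rule of the form "the condition attributes determine the target attribute", and the function name `find_fd_violations` and the sample records are hypothetical. Records are plain dictionaries, so the same check could be applied to SQL rows or to documents from a NoSQL store.

```python
from collections import defaultdict


def find_fd_violations(rows, condition_attrs, target_attr):
    """Return groups of rows that violate condition_attrs -> target_attr.

    rows: iterable of dicts (hypothetical record format).
    A group violates the rule when rows sharing the same condition
    values carry more than one distinct target value.
    """
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[attr] for attr in condition_attrs)
        groups[key].append(row)

    violations = {}
    for key, members in groups.items():
        distinct_targets = {member[target_attr] for member in members}
        if len(distinct_targets) > 1:
            violations[key] = members
    return violations


# Hypothetical sample data: "zip" is assumed to determine "city".
records = [
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "NYC"},
    {"zip": "94105", "city": "San Francisco"},
]
conflicts = find_fd_violations(records, ("zip",), "city")
for key, members in conflicts.items():
    print(key, [m["city"] for m in members])
```

Grouping by the condition attributes keeps the check to a single pass over the data; deciding which of the conflicting values to keep is a separate repair step.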

Downloads: 2612
7496 The Association between the Firm Characteristics and Corporate Mandatory Disclosure the Case of Greece

Authors: Despina Galani, Anastasios Alexandridis, Antonios Stavropoulos

Abstract:

The main thrust of this paper is to assess the level of disclosure in the annual reports of non-financial Greek firms and to empirically investigate the hypothesized impact of several firm characteristics on the extent of mandatory disclosure. A disclosure checklist consisting of 100 mandatory items was developed to assess the level of disclosure in the 2009 annual reports of 43 Greek companies listed on the Athens Stock Exchange. The association between the level of disclosure and some firm characteristics was examined using multiple linear regression analysis. The study reveals that Greek companies in general have responded adequately to the mandatory disclosure requirements of the regulatory bodies. The findings also indicate that firm size was significantly and positively associated with the level of disclosure. The remaining variables, such as age, profitability, liquidity, and board composition, were found to be insignificant in explaining the variation of mandatory disclosures. The outcome of this study is undoubtedly of great concern to the investment community at large, as it assists in evaluating the extent of mandatory disclosure by Greek firms and in explaining the variation of disclosure in light of firm-specific characteristics.

Keywords: Mandatory disclosure, Annual report, Disclosure index

Downloads: 3980
7495 Coalescing Data Marts

Authors: N. Parimala, P. Pahwa

Abstract:

OLAP uses multidimensional structures to provide access to data for analysis. Traditionally, OLAP operations have focused on retrieving data from a single data mart. An exception is the drill-across operator. This, however, is restricted to retrieving facts on common dimensions of the multiple data marts. Our concern is to define further operations for retrieving data from multiple data marts. To this end, we have defined six operations which coalesce data marts. In doing so, we consider the common as well as the non-common dimensions of the data marts.

Keywords: Data warehouse, Dimension, OLAP, Star Schema.

Downloads: 1559
7494 Prediction of the Solubility of Benzoic Acid in Supercritical CO2 Using the PC-SAFT EoS

Authors: Hamidreza Bagheri, Alireza Shariati

Abstract:

There are many difficulties in the purification of raw components and products. However, researchers are seeking better ways for purification. One of the recent methods is extraction using supercritical fluids. In this study, the phase equilibria of the benzoic acid-supercritical carbon dioxide system were investigated. Regarding the phase equilibria of this system, the modeling of solid-supercritical fluid behavior was performed using the Perturbed-Chain Statistical Associating Fluid Theory (PC-SAFT) and Peng-Robinson equations of state (PR EoS). For this purpose, five PC-SAFT EoS parameters for pure benzoic acid were obtained using its experimental vapor pressure. Benzoic acid has association sites, and the behavior of the benzoic acid-supercritical fluid system was well predicted using both equations of state, while the binary interaction parameter values for the PR EoS were negative. A genetic algorithm, which is one of the most accurate global optimization algorithms, was also used to optimize the pure benzoic acid parameters and the binary interaction parameters. The AAD% value for the PC-SAFT EoS was 0.22 for the carbon dioxide-benzoic acid system.

Keywords: Supercritical fluids, Solubility, Solid, PC-SAFT EoS, Genetic algorithm.

Downloads: 2667
7493 Automatic Building an Extensive Arabic FA Terms Dictionary

Authors: El-Sayed Atlam, Masao Fuketa, Kazuhiro Morita, Jun-ichi Aoe

Abstract:

Field Association (FA) terms are a limited set of discriminating terms that give us the knowledge to identify document fields, which is effective in document classification, similar file retrieval and passage retrieval. The problem lies in the lack of an effective method to automatically extract relevant Arabic FA Terms to build a comprehensive dictionary. Moreover, all previous studies are based on FA terms in English and Japanese, and the extension of FA terms to other languages such as Arabic could strengthen further research. This paper presents a new method to extract Arabic FA Terms from domain-specific corpora using part-of-speech (POS) pattern rules and corpora comparison. Experimental evaluation is carried out for 14 different fields using 251 MB of domain-specific corpora obtained from Arabic Wikipedia dumps and Alhyah news; on average, 2,825 FA Terms (single and compound) were selected per field. In the experimental results, recall and precision are 84% and 79% respectively. Therefore, this method selects a higher number of relevant Arabic FA Terms at high precision and recall.

Keywords: Arabic Field Association Terms, information extraction, document classification, information retrieval.

Downloads: 1734
7492 The Truth about Good and Evil: A Mixed-Methods Approach to Color Theory

Authors: Raniya Alsharif

Abstract:

The color theory of good and evil is the association of colors with the omnipresent concept of good and evil, in which human behavior and perception can be highly influenced by seeing black and white, making these connotations almost dangerously distinctive where they can be very hard to distinguish. This theory is a human construct that dates back to ancient Egypt and has been used since then in almost all forms of communication and expression, such as art, fashion, literature, and religious manuscripts, helping to implant preconceived ideas that influence behavior and society. This mixed-methods research uses surveys to collect quantitative data related to the theory and a vignette to collect qualitative data. In the vignette scenario, participants aged 18-25 will style two characters, one with good and one with bad characteristics, in color-contrasting clothes. Both methods yield results about the nature of the preconceived perceptions associated with ‘black and white’ and ‘good and evil’, illustrate the important role of media and communications in human behavior and the subconscious, and uncover how far this theory goes in the age of social media enlightenment.

Keywords: Color perception, interpretivism, thematic analysis, vignettes.

Downloads: 1001
7491 The Potential Involvement of Platelet Indices in Insulin Resistance in Morbid Obese Children

Authors: Orkide Donma, Mustafa M. Donma

Abstract:

Association between insulin resistance (IR) and hematological parameters has long been a matter of interest. Within this context, body mass index (BMI), red blood cells, white blood cells and platelets have been part of the discussion. Platelet-related parameters associated with IR may be useful indicators for the identification of IR. Platelet indices such as mean platelet volume (MPV), platelet distribution width (PDW) and plateletcrit (PCT) are being questioned for their possible association with IR. The aim of this study was to investigate the association between platelet (PLT) count as well as PLT indices and the surrogate indices used to determine IR in morbid obese (MO) children. A total of 167 children participated in the study. Three groups were constituted. The number of cases was 34, 97 and 36 children in the normal BMI (N-BMI), MO and metabolic syndrome (MetS) groups, respectively. Sex- and age-dependent BMI-based percentile tables prepared by the World Health Organization were used for the definition of morbid obesity. MetS criteria were determined. BMI values, homeostatic model assessment for IR (HOMA-IR), alanine transaminase-to-aspartate transaminase ratio (ALT/AST) and diagnostic obesity notation model assessment laboratory (DONMA-lab) index values were computed. PLT count and indices were analyzed using an automated hematology analyzer. Data were collected for statistical analysis using SPSS for Windows. Arithmetic mean and standard deviation were calculated. Mean values of PLT-related parameters in both control and study groups were compared by one-way ANOVA followed by Tukey post hoc tests to determine whether a significant difference exists among the groups. Correlation analyses between PLT and IR indices were performed. Statistical significance was accepted at p < 0.05. Increased values were detected for PLT (p < 0.01) and PCT (p > 0.05) in the MO group compared to those observed in children with N-BMI. Significant increases for PLT (p < 0.01) and PCT (p < 0.05) were observed in the MetS group in comparison with the values obtained in children with N-BMI (p < 0.01). Significantly lower MPV and PDW values were obtained in the MO group compared to the control group (p < 0.01). HOMA-IR (p < 0.05), DONMA-lab index (p < 0.001) and ALT/AST (p < 0.001) values in the MO and MetS groups were significantly increased compared to the N-BMI group. On the other hand, DONMA-lab index values also differed between the MO and MetS groups (p < 0.001). In the MO group, PLT was negatively correlated with MPV and PDW values. These correlations were not observed in the N-BMI group. None of the IR indices exhibited a correlation with PLT and PLT indices in the N-BMI group. HOMA-IR showed significant correlations with both PLT and PCT in the MO group. All three IR indices were well correlated with each other in all groups. These findings point to the missing link between IR and PLT activation. In conclusion, PLT and PCT may be related to IR in addition to their identities as hemostasis markers during morbid obesity. Our findings suggest that the DONMA-lab index appears to be the best surrogate marker for IR due to its discriminative feature between morbid obesity and MetS.

Keywords: Children, insulin resistance, metabolic syndrome, plateletcrit, platelet indices.
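
For reference, HOMA-IR is conventionally computed from fasting insulin and fasting glucose. The abstract does not state which variant the authors used, so the sketch below assumes the common form with insulin in µU/mL and glucose in mg/dL (divisor 405; with glucose in mmol/L the divisor is 22.5). The function names and the sample values are hypothetical and are not taken from the study.

```python
def homa_ir(fasting_insulin_uU_ml, fasting_glucose_mg_dl):
    """HOMA-IR with insulin in uU/mL and glucose in mg/dL (assumed units)."""
    return fasting_insulin_uU_ml * fasting_glucose_mg_dl / 405.0


def alt_ast_ratio(alt_u_l, ast_u_l):
    """ALT-to-AST ratio; both enzyme activities in the same unit (e.g. U/L)."""
    if ast_u_l == 0:
        raise ValueError("AST must be non-zero")
    return alt_u_l / ast_u_l


# Hypothetical values for illustration only.
example_homa = homa_ir(10.0, 81.0)       # 10 * 81 / 405
example_ratio = alt_ast_ratio(30.0, 20.0)
print(example_homa, example_ratio)
```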

Downloads: 674
7490 Mining Big Data in Telecommunications Industry: Challenges, Techniques, and Revenue Opportunity

Authors: Hoda A. Abdel Hafez

Abstract:

Mining big data represents a big challenge nowadays. Much research is concerned with mining massive amounts of data and big data streams. Mining big data faces many challenges, including scalability, speed, heterogeneity, accuracy, provenance and privacy. In the telecommunication industry, mining big data is like mining for gold; it represents a big opportunity for maximizing the revenue streams in this industry. This paper discusses the characteristics of big data (volume, variety, velocity and veracity), data mining techniques and tools for handling very large data sets, mining big data in telecommunication, and the benefits and opportunities gained from it.

Keywords: Mining Big Data, Big Data, Machine learning, Data Streams, Telecommunication.

Downloads: 2480
7489 Comparative Analysis of Diverse Collection of Big Data Analytics Tools

Authors: S. Vidhya, S. Sarumathi, N. Shanthi

Abstract:

Over the past era, considerable effort and study have gone into developing proficient tools for performing various tasks on big data. Recently, big data has received a lot of publicity, and for good reason. Because the datasets are large and complex, they are difficult to process with traditional data processing applications. This concern makes it all the more necessary to produce a variety of big data tools. Moreover, the main aim of big data analytics is to apply advanced analytic techniques to very large, diverse datasets that range in size from terabytes to zettabytes and in type from structured to unstructured and from batch to streaming. Big data is useful for data sets whose size or type is beyond the capability of traditional relational databases to capture, manage and process the data with low latency. The resulting challenges have led to the emergence of powerful big data tools. In this survey, a varied collection of big data tools is illustrated and compared in terms of their salient features.

Keywords: Big data, Big data analytics, Business analytics, Data analysis, Data visualization, Data discovery.

Downloads: 3775
7488 Forecasting Malaria Cases in Bujumbura

Authors: Hermenegilde Nkurunziza, Albrecht Gebhardt, Juergen Pilz

Abstract:

The focus of this work is to assess which method allows better forecasting of malaria cases in Bujumbura (Burundi) when taking into account the association between climatic factors and the disease. For the period 1996-2007, real monthly data on both malaria epidemiology and climate in Bujumbura are described and analyzed. We propose a hierarchical approach to achieve our objective. We first fit a Generalized Additive Model to malaria cases to obtain an accurate predictor, which is then used to predict future observations. Various well-known forecasting methods are compared, leading to different results. Based on the in-sample mean absolute percentage error (MAPE), the multiplicative exponential smoothing state space model with multiplicative error and seasonality performed best.

Keywords: Burundi, Forecasting, Malaria, Regression model, State space model.
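
The in-sample MAPE used to rank the forecasting methods can be illustrated with a small sketch. It makes two loud assumptions: simple exponential smoothing stands in for the multiplicative state space model named in the abstract (it is a much simpler technique, used here only to produce forecasts), and the monthly counts are invented for illustration. The function names are hypothetical.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero.
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


def simple_exp_smoothing(series, alpha):
    """One-step-ahead in-sample forecasts.

    The first forecast is seeded with the first observation; each later
    forecast blends the previous observation and the previous forecast.
    """
    forecasts = [series[0]]
    for observed in series[:-1]:
        forecasts.append(alpha * observed + (1 - alpha) * forecasts[-1])
    return forecasts


# Hypothetical monthly case counts, not data from the study.
cases = [100, 120, 80, 100]
fitted = simple_exp_smoothing(cases, alpha=0.5)
in_sample_mape = mape(cases, fitted)
print(fitted, round(in_sample_mape, 2))
```

Comparing candidate methods then amounts to computing `mape` for each method's fitted values on the same series and choosing the lowest value.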

Downloads: 1984
7487 Multi-labeled Data Expressed by a Set of Labels

Authors: Tetsuya Furukawa, Masahiro Kuzunishi

Abstract:

Collected data must be organized to be utilized efficiently, and hierarchical classification of data is an efficient approach to organizing data. When data is classified into multiple categories or annotated with a set of labels, users request multi-labeled data by giving a set of labels. There are several interpretations of the data expressed by a set of labels. This paper discusses which data is expressed by a set of labels by introducing orders for sets of labels, and shows that there are four types of orders, which are characterized by whether the labels of the expressed data include every label of the given set of labels within the range of the set. Two desirable properties of the orders are also discussed: that data is also expressed by the higher set of labels, and that different sets of labels express different data.

Keywords: Classification Hierarchies, Multi-labeled Data, Multiple Classification, Orders of Sets of Labels

Downloads: 1304
7486 Transportation and Physical Development around Kumasi, Ghana

Authors: Justice K. Owusu-Ansah, Kevin O'Connor

Abstract:

This research explores the links between physical development and transportation infrastructure around Kumasi, Ghana. It utilizes census data as well as fieldwork and interviews carried out during July and December 2005. The results suggest that there is a weak association between transportation investments and physical development, and that recent housing has generally occurred in poorly accessible locations. Road investments have generally followed physical expansion rather than the reverse. Hence, policies designed to manage the fast growth now occurring around Ghanaian cities should not focus exclusively on improving transportation infrastructure but also on strengthening the underlying traditional land management structures and the official land administrative institutions that operate within those structures.

Keywords: Housing, Kumasi, population, physical development, transportation, villages.

Downloads: 2163
7485 Combining the Deep Neural Network with the K-Means for Traffic Accident Prediction

Authors: Celso L. Fernando, Toshio Yoshii, Takahiro Tsubota

Abstract:

Understanding the causes of road accidents and predicting their occurrence is key to preventing deaths and serious injuries from road accident events. Traditional statistical methods such as Poisson and logistic regression have been used to find the association between traffic environmental factors and accident occurrence; recently, the artificial neural network (ANN), a computational technique that learns from historical data to make more accurate predictions, has emerged. Despite its ability to make accurate predictions, the ANN has difficulty dealing with a highly unbalanced distribution of attribute patterns in the training dataset; in such circumstances, the ANN treats the minority group as noise. However, in real-world data, the minority group is often the group of interest; e.g., in road traffic accident data, the accident events are the group of interest. This study proposes a combination of k-means with the ANN to improve the predictive ability of the neural network model by alleviating the effect of the unbalanced distribution of attribute patterns in the training dataset. The results show that the proposed method improves the ability of the neural network to make predictions on a dataset with a highly unbalanced distribution of attribute patterns; however, on a dataset with an evenly distributed set of attribute patterns, the proposed method performs almost like a standard neural network.

Keywords: Accident risks estimation, artificial neural network, deep learning, K-means, road safety.

Downloads: 974
7484 Cloud Forest Characteristics of Khao Nan, Thailand

Authors: P. Sangarun, W. Srisang, K. Jaroensutasinee, M. Jaroensutasinee

Abstract:

To better understand cloud forest characteristics, the climate, vegetation, soil and hydrology of a tropical montane cloud forest at Khao Nan, Nakhon Si Thammarat were studied during 18-21 April 2007. The results showed that as air temperature at Sanyen cloud forest increased, the percent relative humidity decreased. The amount of solar radiation at Sanyen cloud forest had a positive association with the amount of solar radiation at Parah forest. The amount of solar radiation at Sanyen cloud forest was very low, with a range of 0-19 W/m2. On the other hand, the amount of solar radiation at Parah forest was high, with a range of 0-1000 W/m2. Leaf width, leaf length, leaf thickness and leaf area did not differ with increasing elevation. As elevation increased, bush height and tree height decreased. Neither bush width nor bush ratio was associated with elevation. As elevation increased, the percent epiphyte cover and the percent soil moisture increased, but water temperature, conductivity, and dissolved oxygen decreased. The percent soil moisture and organic content were higher at elevations above 900 m than at elevations below.

Keywords: Cloud forest, climate, vegetation, soil, hydrology.

Downloads: 1873
7483 Spatial Distribution of Local Sheep Breeds in Antalya Province

Authors: Serife Gulden Yilmaz, Suleyman Karaman

Abstract:

Sheep breeding is important in Turkey for meeting the demand for red meat, supplying industrial raw materials and providing rural employment. It is also very important to ensure the selection and continuity of the breeds that are raised in order to increase the quality and productivity of sheep breeding products. The protection of local breeds and crossbreds also enables the development of the sector in the region and the reduction of imports. In this study, the data were obtained from the records of the Turkish Statistical Institute and Antalya Sheep & Goat Breeders' Association. The spatial distribution of sheep breeds in Antalya is reviewed statistically in terms of concentration at the local level for 2015. For this purpose, mapping, box plots and linear regression are used. Concentration is presented by mapping studbook data on local sheep breeding and total sheep farms. It is observed that the Pırlak breed (17.5%) and Merinos crossbreed (16.3%) have the highest concentration in the region. These breeds are followed, in order, by the Akkaraman breed (11%), Pırlak crossbreed (8%), Merinos breed (7.9%), Akkaraman crossbreed (7.9%) and Ivesi breed (7.2%).

Keywords: Antalya, sheep breeds, spatial distribution, local.

Downloads: 1231
7482 Contributory Factors to Diabetes Dietary Regimen Non Adherence in Adults with Diabetes

Authors: Okolie Uchenna, Ehiemere Ijeoma, Ezenduka Pauline, Ogbu Sylvester

Abstract:

A cross-sectional survey design was used to collect data from 370 diabetic patients. Two instruments were used to obtain data: an in-depth interview guide and a researcher-developed questionnaire. Fisher's exact test was used to investigate the association between the identified factors and non-adherence. The factors identified were: socio-demographic factors, such as gender, age, marital status, educational level and occupation; psychosocial obstacles, such as non-affordability of the prescribed diet, frustration due to the restriction, limited spousal support, feelings of deprivation, the feeling that temptation is inevitable, difficulty in adhering at social gatherings and difficulty in revealing to a host that one is diabetic; and health care provider obstacles, namely poor attitude of health workers, irregular diabetes education in clinics, a limited number of nutrition education sessions/inability of the patients to estimate the desired quantity of food, no reminder postcards or phone calls about upcoming patient appointments, and delayed start of appointments/time wasting in clinics.

Keywords: Behavior change, diabetes mellitus, dietary management, diet adherence.
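
As an illustration of how Fisher's exact test relates a binary factor to non-adherence, the sketch below computes the two-sided p-value for a 2x2 table directly from the hypergeometric distribution. The function name and the counts are hypothetical and are not taken from the study; the two-sided rule used here (summing all tables no more probable than the observed one) is the common convention, and the abstract does not say which variant the authors applied.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Rows could be, for example, factor present / absent; columns
    non-adherent / adherent. Margins are held fixed.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def table_probability(x):
        # Probability of seeing x in the top-left cell given the margins.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    observed = table_probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    probabilities = [table_probability(x) for x in range(lowest, highest + 1)]
    # Small relative tolerance guards against floating-point ties.
    return sum(p for p in probabilities if p <= observed * (1 + 1e-9))


# Hypothetical counts for illustration only.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 4))
```

The exact test is a natural choice when some cells of the table are small, which is where the chi-square approximation becomes unreliable.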

Downloads: 3432
7481 Role of Social Capital on Consumer Attitudes, Peer Influence and Behavioral Intentions: A Social Media Perspective

Authors: Qazi Mohammed Ahmed, Osman Sadiq Paracha, Iftikhar Hussain

Abstract:

The study aims to explore the unaddressed relationship between social capital and consumers’ underlying behavioral intentions. The study postulates that this association is mediated by attitudes and peer influence. The research draws evidence from a usable sample of 673 responses. The majority consists of young and energetic social media users in Pakistan who use virtual communities as a way of life. Variance-based structural equation modeling was applied using SmartPLS 3. The results reveal that social capital exerts a statistically supportive association with both attitudes and peer influence. In contrast, this predictor variable shows an insignificant linkage with behavioral intentions, but this relationship is fully mediated by consumer attitudes and peer influence. The paper enhances marketing literature with respect to the unexplored society of Pakistan. It also provides a lens for contemporary advertisers, in terms of supporting their social media campaigns with affiliative and cohesive elements. The study also identifies a series of predictor variables that could further be tested with attitudes, subjective norms and behavioral responses.

Keywords: Behavioral intentions, consumer attitudes, peer influence, social capital.

Downloads: 596
7480 Safety Practices among Bus Operators during Wee Hour Operations

Authors: M.R. Osman, F. Abas, M.S. Noh, A. Mohamad Suffian, O. Ilhamah, H.Z. Zarir, A.B. Wahida, P. Noor Faradila, M.F. Siti Atiqah

Abstract:

The Safety Health and Environment Code of Practice (SHE COP) was developed to help road transportation operators manage their operations in a systematic and safe manner. A study was conducted to determine the effectiveness of SHE COP implementation during the non-OPS period. The objective of the study is to evaluate the implementation of SHE COP among bus operators during wee hour operations. The data were collected by completing a checklist after observing the activities during pre-departure, during the trip, and upon arrival. The results show that there are seven widely practiced SHE COP elements. 22% of the buses had an average speed exceeding the maximum permissible speed on the highways (90 km/h), with 13% of the buses travelling at more than 100 km/h. The statistical analysis shows that there is only one significant association, which relates speeding to the prior presence of enforcement officers.

Keywords: Safety practices, speeding, wee hour.

Downloads: 1563
7479 The Comparison of Data Replication in Distributed Systems

Authors: Iman Zangeneh, Mostafa Moradi, Ali Mokhtarbaf

Abstract:

The necessity of the ever-increasing use of distributed data in computer networks is obvious to all. One technique that is applied to distributed data to increase efficiency and reliability is data replication. In this paper, after introducing this technique and its advantages, we examine some dynamic data replication strategies. We examine their characteristics for some overus scenario and then propose some suggestions for their improvement.

Keywords: data replication, data hiding, consistency, dynamic data replication strategy

Downloads: 1635
7478 Association of Sensory Processing and Cognitive Deficits in Children with Autism Spectrum Disorders – Pioneer Study in Saudi Arabia

Authors: Rana M. Zeina, Laila AL-Ayadhi, Shahid Bashir

Abstract:

The association between sensory problems and cognitive abilities has been studied in individuals with Autism Spectrum Disorders (ASDs). In this study, we used a neuropsychological test to evaluate memory and attention in ASDs children with sensory problems compared to ASDs children without sensory problems. Four visual memory tests of the Cambridge Neuropsychological Test Automated Battery (CANTAB), including Big/Little Circle (BLC), Simple Reaction Time (SRT), Intra/Extra Dimensional Set Shift (IED) and Spatial Recognition Memory (SRM), were administered to 14 ASDs children with sensory problems and 13 ASDs children without sensory problems, aged 3 to 12 with an IQ above 70. ASDs individuals with sensory problems performed worse than the ASDs group without sensory problems on comprehension, learning, reversal and simple reaction time tasks, and no significant difference between the two groups was recorded in terms of the visual memory and visual comprehension tasks. The findings of this study suggest that ASDs children with sensory problems face deficits in learning, comprehension, reversal, and speed of response to a stimulus.

Keywords: Visual memory, Attention, Autism Spectrum Disorders (ASDs).

Downloads: 2535
7477 Implementation of an IoT Sensor Data Collection and Analysis Library

Authors: Jihyun Song, Kyeongjoo Kim, Minsoo Lee

Abstract:

Due to the development of information technology and wireless Internet technology, various data are being generated in various fields. These data are advantageous in that they provide real-time information to the users themselves. However, when the data are accumulated and analyzed, a wider variety of information can be extracted. In addition, the development and dissemination of boards such as Arduino and Raspberry Pi have made it possible to easily test various sensors, and it is possible to collect sensor data directly by using database application tools such as MySQL. These directly collected data can be used for various research and can be useful as data for data mining. However, there are many difficulties in using the board to collect data, especially when the user is not a computer programmer or is using it for the first time. Even if data are collected, lack of expert knowledge or experience may cause difficulties in data analysis and visualization. In this paper, we aim to construct a library for sensor data collection and analysis to overcome these problems.

Keywords: Clustering, data mining, DBSCAN, k-means, k-medoids, sensor data.

Downloads: 2010
7476 Genotypic and Allelic Distribution of Polymorphic Variants of Gene SLC47A1 Leu125Phe (rs77474263) and Gly64Asp (rs77630697) and Their Association to the Clinical Response to Metformin in Adult Pakistani T2DM Patients

Authors: Sadaf Moeez, Madiha Khalid, Zoya Khalid, Sania Shaheen, Sumbul Khalid

Abstract:

Background: Inter-individual variation in response to metformin, which is considered a first-line therapy for T2DM treatment, is considerable. The current study aimed to investigate the impact of two genetic variants, Leu125Phe (rs77474263) and Gly64Asp (rs77630697) in gene SLC47A1, on the clinical efficacy of metformin in T2DM Pakistani patients. Methods: The study included 800 T2DM patients (400 metformin responders and 400 metformin non-responders) along with 400 ethnically matched healthy individuals. The genotypes were determined by allele-specific polymerase chain reaction. In-silico analysis was done to confirm the effect of the two SNPs on the structure of genes. Association was statistically determined using SPSS software. Results: Minor allele frequency for rs77474263 and rs77630697 was 0.13 and 0.12. For SLC47A1 rs77474263, the heterozygotes carrying one mutant allele ‘T’ (CT) were fewer in metformin responders than metformin non-responders (29.2% vs. 35.5%). Likewise, the efficacy was further reduced (7.2% vs. 4.0%) in homozygotes with two copies of the ‘T’ allele (TT). Remarkably, T2DM cases with two copies of allele ‘C’ (CC) had 2.11 times more probability to respond to metformin monotherapy. For SLC47A1 rs77630697, the heterozygotes carrying one mutant allele ‘A’ (GA) were fewer in metformin responders than metformin non-responders (33.5% vs. 43.0%). Likewise, the efficacy was further reduced (8.5% vs. 4.5%) in homozygotes with two copies of the ‘A’ allele (AA). Remarkably, T2DM cases with two copies of allele ‘G’ (GG) had 2.41 times more probability to respond to metformin monotherapy. In-silico analysis revealed that these two variants affect the structure and stability of their corresponding proteins. Conclusion: The present data suggest that SLC47A1 Leu125Phe (rs77474263) and Gly64Asp (rs77630697) polymorphisms were associated with the therapeutic response to metformin in T2DM patients of Pakistan.

Keywords: Diabetes, T2DM, SLC47A1, Pakistan, polymorphism.

Downloads: 734
7475 Analysis of the Structural Fluctuation of the Permitted Building Areas and Housing Distribution Ratios - Focused on 5 Cities Including Bucheon

Authors: Cheon Sik Min, Hyeong Wook Song, Sook Yeon Shim, Hoon Chang

Abstract:

This study analyzed the correlation between permitted building areas and housing distribution ratios, together with their fluctuation, and tested a distribution model across 3 successive governments in 5 cities including Bucheon, using time-series administrative data. The results were interpreted in light of the policies pursued by the successive governments in order to examine the structural fluctuation of permitted building areas and housing distribution ratios. Spectral analysis was performed to analyze the fluctuation of permitted building areas and housing distribution ratios during the 3 governments and to examine the cycles of the time-series data. Tabulation was performed to describe the correlations between permitted building areas and housing distribution ratios statistically, and a goodness-of-fit test was conducted to explain the differences in the fluctuation distribution of permitted building areas and housing distribution ratios among the 3 governments.

Keywords: The Permitted Building Areas, Housing Distribution Ratios, the Structural Fluctuation.

Downloads: 1194
7474 Effects of Polyvictimization in Suicidal Ideation among Children and Adolescents in Chile

Authors: Oscar E. Cariceo

Abstract:

In Chile, there is a lack of evidence about the impact of polyvictimization on the emergence of suicidal thoughts among children and young people. Thus, this study aims to explore the association between the episodes of polyvictimization suffered by Chilean children and young people and the manifestation of signs related to suicidal tendencies. To achieve this purpose, secondary data from the First Polyvictimization Survey on Children and Adolescents of 2017 were analyzed, and a binomial logistic regression model was applied to establish the probability that young people are experiencing suicidal ideation episodes. The main findings show that women between the ages of 13 and 15 years, who are in seventh grade and second in subsidized schools, are more likely to express suicidal ideas, which increases if they have suffered different types of victimization, particularly physical violence, psychological aggression, and sexual abuse.

Keywords: Chile, polyvictimization, suicidal ideation, youth.

Downloads: 594
7473 Government (Big) Data Ecosystem: Definition, Classification of Actors, and Their Roles

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Organizations, including governments, generate (big) data that are high in volume, velocity, and veracity, and that come from a variety of sources. Public administrations are using (big) data, implementing base registries, and enforcing data sharing across the entire government to deliver integrated (big) data related services, provide insights to users, and support good governance. Government (big) data ecosystem actors represent distinct entities that provide data, consume data, manipulate data to offer paid services, and extend data services such as data storage and hosting to other actors. In this research work, we perform a systematic literature review. The key objectives of this paper are to propose a robust definition of the government (big) data ecosystem and a classification of government (big) data ecosystem actors and their roles. We showcase a graphical view of actors, roles, and their relationships in the government (big) data ecosystem. We also discuss our research findings. We did not find many published research articles about the government (big) data ecosystem, including its definition and the classification of actors and their roles. Therefore, we borrowed ideas for the government (big) data ecosystem from numerous areas in the literature, including scientific research data, humanitarian data, open government data, and industry data.

Keywords: Big data, big data ecosystem, classification of big data actors, big data actors roles, definition of government (big) data ecosystem, data-driven government, eGovernment, gaps in data ecosystems, government (big) data, public administration, systematic literature review.

Downloads: 2143
7472 Urbanization and Income Inequality in Thailand

Authors: Acumsiri Tantiakrnpanit

Abstract:

This paper aims to examine the relationship between urbanization and income inequality in Thailand during the period 2002–2020, using a panel of data for 76 provinces collected from Thailand’s National Statistical Office (Labor Force Survey: LFS), as well as geospatial data from the U.S. Air Force Defense Meteorological Satellite Program (DMSP) and the Visible Infrared Imaging Radiometer Suite Day/Night band (VIIRS-DNB) satellite for 19 selected years. This paper employs two different definitions to identify urban areas: 1) Urban areas defined by Thailand's National Statistical Office (LFS), and 2) Urban areas estimated using nighttime light data from the DMSP and VIIRS-DNB satellite. The second method includes two sub-categories: 2.1) Determining urban areas by calculating nighttime light density with a population density of 300 people per square kilometer, and 2.2) Calculating urban areas based on nighttime light density corresponding to a population density of 1,500 people per square kilometer. The empirical analysis based on Ordinary Least Squares (OLS), fixed effects, and random effects models reveals a consistent U-shaped relationship between income inequality and urbanization. The findings from the econometric analysis demonstrate that urbanization or population density has a significant and negative impact on income inequality. Moreover, the square of urbanization shows a statistically significant positive impact on income inequality. Additionally, there is a negative association between logarithmically transformed income and income inequality. This paper also proposes the inclusion of satellite imagery, geospatial data, and spatial econometric techniques in future studies to conduct quantitative analysis of spatial relationships.
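
The U-shaped relationship described above can be written as a quadratic specification. The form below is only an illustrative sketch: the symbols, the pooled form (without province effects), and the choice of controls are assumptions and are not taken from the paper.

```latex
% Illustrative pooled specification (assumed notation):
%   Ineq_{it} : income inequality in province i, year t
%   U_{it}    : urbanization (or population density)
%   Y_{it}    : income
\[
  \mathrm{Ineq}_{it} = \beta_0 + \beta_1 U_{it} + \beta_2 U_{it}^{2}
                     + \beta_3 \ln Y_{it} + \varepsilon_{it}
\]
% Signs reported in the abstract:
\[
  \beta_1 < 0, \qquad \beta_2 > 0, \qquad \beta_3 < 0
\]
% With these signs the curve is U-shaped in U, with its minimum at
\[
  U^{*} = -\frac{\beta_1}{2\beta_2} > 0
\]
```

Under this sketch, inequality falls as urbanization rises toward the turning point and rises beyond it. A fixed effects version would add a province-specific intercept to the same equation.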

Keywords: Income inequality, nighttime light, population density, Thailand, urbanization.

Downloads: 127
7471 Imputation Technique for Feature Selection in Microarray Data Set

Authors: Younies Mahmoud, Mai Mabrouk, Elsayed Sallam

Abstract:

Analyzing DNA microarray data sets is a great challenge for bioinformaticians because of the complexity of applying statistical and machine learning techniques. The challenge is compounded when the microarray data sets contain missing data, which happens regularly, because these techniques cannot deal with missing data. One of the most important data analysis processes on a microarray data set is feature selection. This process finds the most important genes that affect a given disease. In this paper, we introduce a technique for imputing the missing data in microarray data sets while performing feature selection.

Keywords: DNA microarray, feature selection, missing data, bioinformatics.

Downloads: 2791
7470 Automatic Real-Patient Medical Data De-Identification for Research Purposes

Authors: Petr Vcelak, Jana Kleckova

Abstract:

Our medicine-oriented research is based on a medical data set of real patients. Sharing private patient data with people other than clinicians or hospital staff is a security problem, so we have to remove personally identifying information from the medical data. After a de-identification process, the medical data without private data are available for any research purpose. In this paper, we introduce a universal, automatic, rule-based de-identification application that performs this process on heterogeneous medical data. A patient's private identification is replaced by a unique identification number, even in burned-in annotations in pixel data. The same identification number is used for all of a patient's medical data, so relationships within the data are preserved. The hospital can take advantage of research feedback based on the results.
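
The consistent replacement of patient identifiers described above can be sketched as follows. This is a minimal illustration and not the authors' application: the class name, the field names (`patient_id`, `name`, `birth_date`), and the sequential ID format are all hypothetical, and burned-in annotations in pixel data (which would need OCR) are not handled.

```python
import itertools


class Pseudonymizer:
    """Maps each real patient identifier to one stable research ID (hypothetical sketch)."""

    def __init__(self):
        self._ids = {}                       # real identifier -> research ID
        self._counter = itertools.count(1)   # source of new research ID numbers

    def research_id(self, patient_id):
        # Reuse the ID already issued for this patient so that records
        # from different sources stay linked after de-identification.
        if patient_id not in self._ids:
            self._ids[patient_id] = f"R{next(self._counter):06d}"
        return self._ids[patient_id]

    def deidentify(self, record, id_field="patient_id",
                   private_fields=("name", "birth_date")):
        # Drop the private fields (assumed names), then swap the identifier.
        cleaned = {k: v for k, v in record.items() if k not in private_fields}
        cleaned[id_field] = self.research_id(record[id_field])
        return cleaned


# Two records for the same (made-up) patient from different sources.
pseudonymizer = Pseudonymizer()
image_header = {"patient_id": "P-1001", "name": "Jane Doe", "modality": "MR"}
report = {"patient_id": "P-1001", "birth_date": "1980-01-01", "text": "no finding"}

clean_header = pseudonymizer.deidentify(image_header)
clean_report = pseudonymizer.deidentify(report)
print(clean_header["patient_id"] == clean_report["patient_id"])
```

Keeping the mapping in one place is what preserves the relationships between a patient's records. In practice the mapping table itself is sensitive and would need to stay inside the hospital.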

Keywords: DASTA, De-identification, DICOM, Health Level Seven, Medical data, OCR, Personal data

Downloads: 1642
7469 Modelling the States of Public Client Participation in Public Private Partnership Arrangements

Authors: Eisa A. Alsafran, Francis T. Edum-Fotwe, Wayne E. Lord

Abstract:

The degree to which a public client actively participates in Public Private Partnership (PPP) schemes is seen as a determinant of the success of the arrangement and, in particular, of efficiency in the delivery of the assets of any infrastructure development. Asset delivery is often an early barometer for judging the overall performance of the PPP. Currently, there are no defined descriptors for the degree of such participation. The lack of defined descriptors makes the association between the degree of participation and the efficiency of asset delivery difficult to establish. This is particularly so if an optimum effect is desired. In addition, such an association is important for the strategic decision to embark on any PPP initiative. This paper presents a conceptual model of the different levels of participation that characterise PPP schemes. The modelling was achieved by a systematic review of reported sources, published from 2001 to 2015, that address essential aspects and structures of PPP schemes. As a precursor to the modelling, the common areas of Public Client Participation (PCP) were investigated. Equity and risk emerged as two dominant factors in the common areas of PCP, and were therefore adopted to form the foundation of the modelling. The resultant conceptual model defines the different states of combined PCP. The defined states provide a more rational basis for establishing how the degree of PCP affects the efficiency of asset delivery in PPP schemes.

Keywords: Asset delivery, infrastructure development, public private partnership, public client participation.

Downloads: 1607
7468 Analyzing Multi-Labeled Data Based on the Roll of a Concept against a Semantic Range

Authors: Masahiro Kuzunishi, Tetsuya Furukawa, Ke Lu

Abstract:

Classifying data hierarchically is an efficient approach to analyzing data. Data is usually classified into multiple categories, or annotated with a set of labels. To analyze multi-labeled data, the data must be specified by giving a set of labels as a semantic range. Data is analyzed for particular purposes. This paper shows which multi-labeled data should be the target of analysis for those purposes, and discusses the role of a label against a set of labels by investigating the change when a label is added to the set. These discussions yield methods for the advanced analysis of multi-labeled data that are based on the role of a label against a semantic range.

Keywords: Classification Hierarchies, Data Analysis, Multilabeled Data, Orders of Sets of Labels

Downloads: 1208