Search results for: maximal data sets
25342 Control the Flow of Big Data
Authors: Shizra Waris, Saleem Akhtar
Abstract:
Big data is a research area receiving attention from academia and the IT community. In the digital world, the amounts of data produced and stored have grown rapidly within a short period of time. Consequently, this fast-increasing rate of data has created many challenges. In this paper, we use the functionalism and structuralism paradigms to analyze the genesis of big data applications and their current trends. This paper presents a complete discussion of state-of-the-art big data technologies based on batch and stream data processing. Moreover, the strengths and weaknesses of these technologies are analyzed. This study also covers big data analytics techniques, processing methods, some reported case studies from different vendors, several open research challenges, and the opportunities brought about by big data. The similarities and differences of these techniques and technologies with respect to important limitations are also investigated. Emerging technologies are suggested as a solution for big data problems.
Keywords: computer, IT community, industry, big data
Procedia PDF Downloads 194
25341 Using Seismic and GPS Data for Hazard Estimation in Some Active Regions in Egypt
Authors: Abdel-Monem Sayed Mohamed
Abstract:
Egypt's rapidly growing development is accompanied by rising standards of living, particularly in its urban areas. However, there is limited experience in quantifying the sources of risk in Egypt and in designing efficient strategies to mitigate the serious impacts of earthquakes. Both the historical record and recent instrumental records show several seismo-active regions in Egypt where significant earthquakes have occurred. Owing to Egypt's special tectonic setting, the Aswan, Greater Cairo, Red Sea, and Sinai Peninsula regions are territories of high seismic risk that have to be monitored with up-to-date technologies. The investigation and interpretation of these seismic events led to an evaluation of the seismic hazard for disaster prevention and for the safety of densely populated regions and vital national projects such as the High Dam. In addition, for monitoring recent crustal movements, GPS, the most powerful technique of satellite geodesy, is used, with geodetic networks covering these seismo-active regions. The results from the data sets are compared and combined in order to determine the main characteristics of the deformation and to estimate the hazard for the specified regions. The final output compiled from the seismological and geodetic analyses throws light on the geodynamic regime of these seismo-active regions and places Aswan and Greater Cairo in the lowest class according to horizontal crustal strain classifications. This work will serve as a basis for the development of so-called catastrophe models and can further be used for catastrophic risk management. It also attempts to evaluate the risk of large catastrophic losses within the important regions, including the High Dam, strategic buildings, and archaeological sites. Studies on possible earthquake and loss scenarios are a critical input to decision making in insurance as a part of mitigation measures.
Keywords: b-value, Gumbel distribution, seismic and GPS data, strain parameters
Procedia PDF Downloads 459
25340 Model Averaging in a Multiplicative Heteroscedastic Model
Authors: Alan Wan
Abstract:
In recent years, the body of literature on frequentist model averaging in statistics has grown significantly. Most of this work focuses on models with different mean structures but leaves out the variance consideration. In this paper, we consider a regression model with multiplicative heteroscedasticity and develop a model averaging method that combines maximum likelihood estimators of unknown parameters in both the mean and variance functions of the model. Our weight choice criterion is based on a minimisation of a plug-in estimator of the model average estimator's squared prediction risk. We prove that the new estimator possesses an asymptotic optimality property. Our investigation of finite-sample performance by simulations demonstrates that the new estimator frequently exhibits very favourable properties compared to some existing heteroscedasticity-robust model average estimators. The model averaging method hedges against the selection of very bad models and serves as a remedy to variance function misspecification, which often discourages practitioners from modelling heteroscedasticity altogether. The proposed model average estimator is applied to the analysis of two real data sets.
Keywords: heteroscedasticity-robust, model averaging, multiplicative heteroscedasticity, plug-in, squared prediction risk
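To make the weight-choice idea concrete, here is a minimal sketch assuming a small set of nested linear candidate models, a multiplicative variance generated as exp(·), and an in-sample plug-in risk estimate; the paper's actual criterion and candidate set are not specified at this level of detail.

```python
# Hedged sketch: frequentist model averaging under multiplicative
# heteroscedasticity; candidate set and risk estimate are illustrative.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 4))
beta = np.array([1.0, 0.5, 0.0, 0.0])
sigma = np.exp(0.5 * X[:, 0])            # multiplicative heteroscedasticity
y = X @ beta + sigma * rng.normal(size=n)

# Candidate mean models: nested subsets of the regressors.
subsets = [[0], [0, 1], [0, 1, 2], [0, 1, 2, 3]]
fits = []
for s in subsets:
    Xs = X[:, s]
    b = np.linalg.lstsq(Xs, y, rcond=None)[0]   # Gaussian ML = OLS for the mean
    fits.append(Xs @ b)
preds = np.column_stack(fits)                   # n x M matrix of fitted values

def plug_in_risk(w):
    """Plug-in estimate of the squared prediction risk of the averaged fit."""
    return np.mean((y - preds @ w) ** 2)

M = preds.shape[1]
cons = ({"type": "eq", "fun": lambda w: w.sum() - 1.0},)
res = minimize(plug_in_risk, np.full(M, 1.0 / M),
               bounds=[(0.0, 1.0)] * M, constraints=cons)
print("model-average weights:", np.round(res.x, 3))
```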
Procedia PDF Downloads 384
25339 High Performance Computing and Big Data Analytics
Authors: Branci Sarra, Branci Saadia
Abstract:
Because of the rapid growth of data, many computer science tools have been developed to process and analyze this Big Data. High-performance computing architectures have been designed to meet the processing needs of Big Data (from the standpoint of transaction processing and of strategic and tactical analytics). The purpose of this article is to provide a historical and global perspective on the recent trend of high-performance computing architectures, especially as it relates to Analytics and Data Mining.
Keywords: high performance computing, HPC, big data, data analysis
Procedia PDF Downloads 520
25338 Pruning Algorithm for the Minimum Rule Reduct Generation
Authors: Sahin Emrah Amrahov, Fatih Aybar, Serhat Dogan
Abstract:
In this paper, we consider the rule reduct generation problem. The Rule Reduct Generation (RG) and Modified Rule Generation (MRG) algorithms that are used to solve this problem are well known. As an alternative to these algorithms, we develop the Pruning Rule Generation (PRG) algorithm and compare PRG with RG and MRG.
Keywords: rough sets, decision rules, rule induction, classification
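The abstract does not spell out the PRG algorithm itself, so the following is only a hedged sketch of the general setting: inducing decision rules from a decision table and pruning redundant conditions while preserving consistency. The table and attribute names are invented.

```python
# Toy decision table: condition attributes a1..a3 and a decision class d.
rows = [
    {"a1": 1, "a2": 0, "a3": 1, "d": "yes"},
    {"a1": 1, "a2": 1, "a3": 0, "d": "yes"},
    {"a1": 0, "a2": 1, "a3": 1, "d": "no"},
    {"a1": 0, "a2": 0, "a3": 0, "d": "no"},
]

def consistent(rule, rows, label):
    """True if every row matching the rule's conditions has decision = label."""
    for r in rows:
        if all(r[a] == v for a, v in rule) and r["d"] != label:
            return False
    return True

def induce_rule(seed, rows):
    """Start from all conditions of a seed row, then prune redundant ones."""
    label = seed["d"]
    rule = [(a, seed[a]) for a in ("a1", "a2", "a3")]
    for cond in list(rule):                      # try dropping each condition
        shorter = [c for c in rule if c != cond]
        if shorter and consistent(shorter, rows, label):
            rule = shorter                       # keep the pruned, shorter rule
    return rule, label

for row in rows:
    rule, label = induce_rule(row, rows)
    print(" & ".join(f"{a}={v}" for a, v in rule), "=>", label)
```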
Procedia PDF Downloads 528
25337 Studying Second Language Learners' Language Behavior from Conversation Analysis Perspective
Authors: Yanyan Wang
Abstract:
This paper on second language teaching and learning uses the conversation analysis (CA) approach and focuses on how second language learners of Chinese do repair when making clarification requests. In order to examine their behavior in interaction, a comparison was made between native speakers of Chinese and non-native speakers of Chinese. The significance of the research is to make second language teachers and learners aware of repair and of how to seek clarification. Using the methodology of CA, the research involved two sets of naturally occurring recordings, one of native speaker students and the other of non-native speaker students. Both sets of recordings were telephone talks between students and teachers. There were 50 native speaker students and 50 non-native speaker students. From repeated listening to the recordings, the parts with repairs for clarification were selected for analysis, i.e., the moments in the talk when students had problems in understanding or hearing the speaker and had to seek clarification. For example, ‘Sorry, I do not understand’ and ‘Can you repeat the question?’ were used as repairs to make clarification requests. In the data, there were 43 such cases from native speaker students and 88 cases from non-native speaker students; the non-native speaker students were more likely to use repair to seek clarification. The analysis of how the students make clarification requests during conversation was carried out by investigating how the students initiated problems and how the teachers repaired them. In CA terms, this is called other-initiated self-repair (OISR), which refers to student-initiated teacher-repair in this research. The findings show that, in initiating repair, native speaker students pay more attention to mutual understanding (inter-subjectivity), while non-native speaker students, due to their lack of language proficiency, pay more attention to their status of knowledge (epistemic switch). There are three major differences: (1) native Chinese students more often initiate closed-class OISR (seeking specific information in the request), such as repeating a word or phrase from the previous turn, while non-native students more frequently initiate open-class OISR (not specifying the clarification), such as ‘sorry, I don’t understand’; (2) native speakers’ clarification requests are treated by the teacher as concerning understanding of the content, while non-native learners’ clarification requests are treated by the teacher as a language proficiency problem; (3) native speakers do not treat repair as a knowledge issue, and there is no third position in their repair sequences to close the repair, while non-native learners take the repair sequence as a time to adjust their knowledge, with a clear closing third-position token such as ‘oh’ to close the repair sequence so that the talk can return to the topic. In conclusion, this paper uses the conversation analysis approach to compare native Chinese speakers and non-native Chinese learners in their ways of conducting repair when making clarification requests. The findings are useful for future Chinese language teaching and learning, especially for teaching pragmatics such as requests.
Keywords: conversation analysis (CA), clarification request, second language (L2), teaching implication
Procedia PDF Downloads 256
25336 A Landscape of Research Data Repositories in Re3data.org Registry: A Case Study of Indian Repositories
Authors: Prashant Shrivastava
Abstract:
The purpose of this study is to explore the re3data.org registry to identify the research data repository registration workflow. A further objective is to depict the present development of research data repositories in India. The study begins with an approach to understanding the re3data.org registry framework and schema design and then proceeds to explore the status of India's research data repositories in the registry. Research data repositories are gaining wider relevance due to e-research concepts. The re3data.org registry is a good tool for users and researchers to identify appropriate research data repositories for their research requirements. In the Indian environment, a compatible national research data policy is needed to boost the management of research data. A registry of research data repositories is a crucial tool for discovering specific information in a specific domain. Moreover, research data repositories in India have not been studied before. Both the re3data.org registry and the status of Indian research data repositories are discussed in this study.
Keywords: research data, research data repositories, research data registry, re3data.org
Procedia PDF Downloads 324
25335 Crime Prevention with Artificial Intelligence
Authors: Mehrnoosh Abouzari, Shahrokh Sahraei
Abstract:
Today, with the increase in the quantity, quality, and variety of crimes, crime prevention faces a serious challenge: human resources alone, using traditional methods, will not be effective. One of the developments in the modern world is the presence of artificial intelligence in various fields, including criminal law. In fact, the use of artificial intelligence in criminal investigations and in fighting crime is a necessity in today's world. The use of artificial intelligence goes far beyond, and is even separate from, other technologies in the struggle against crime; moreover, its application in criminal science goes beyond prevention as such, extending to the prediction of crime. Crime prevention considers three factors, the offender, the offense, and the victim, and works by changing the conditions of these factors: on the assumption that the offender is rational, it increases the cost and risk of crime so that the offender desists from delinquency, and it makes the victim aware of self-care, of possible exposure to danger, and of ways to make committing crimes more difficult. Meanwhile, artificial intelligence in the field of combating crime and social harm, like an all-seeing eye unconstrained by time and place, looks into the future and predicts the occurrence of a possible crime, thus preventing crimes from occurring. The purpose of this article is to collect and analyze the studies conducted on the use of artificial intelligence in predicting and preventing crime: how capable is this technology of predicting crime and preventing it? The results show that the artificial intelligence technologies in use are capable of predicting and preventing crime and can find patterns in large data sets in a much more efficient way than humans. In crime prediction and prevention, the term artificial intelligence refers to the increasing use of technologies that apply algorithms to large sets of data to assist or replace police. The uses of artificial intelligence discussed here include predicting the time and place of future criminal activity, effective identification of patterns, and accurate prediction of future behavior through data mining, machine learning, deep learning, data analysis, and neural networks. Because the knowledge of criminologists can provide insight into risk factors for criminal behavior, among other issues, computer scientists can match this knowledge with the datasets that artificial intelligence uses.
Keywords: artificial intelligence, criminology, crime, prevention, prediction
Procedia PDF Downloads 75
25334 A Study of Cloud Computing Solution for Transportation Big Data Processing
Authors: Ilgin Gökaşar, Saman Ghaffarian
Abstract:
The need for fast processing of big transportation data, from ridership (e.g., smartcard data) to traffic operations (e.g., traffic detector data), which requires a great deal of computational power, is incontrovertible in Intelligent Transportation Systems. Cloud computing is nowadays an important and popular information technology solution for data processing. It enables users to process enormous amounts of data without having their own computing capacity. Thus, it can be a good choice for transportation big data processing as well. This paper examines how cloud computing can enhance transportation big data processing by contrasting its advantages and disadvantages and discussing its features.
Keywords: big data, cloud computing, Intelligent Transportation Systems, ITS, traffic data processing
Procedia PDF Downloads 467
25333 Parameter Identification Analysis in the Design of Rock Fill Dams
Authors: G. Shahzadi, A. Soulaimani
Abstract:
This research work aims to identify the physical parameters of the constitutive soil model in the design of a rockfill dam by inverse analysis. The best parameters of the constitutive soil model are those that minimize the objective function, defined as the difference between the measured and numerical results. The finite element code Plaxis was used for the numerical simulation. Polynomial and neural-network-based response surfaces were generated to analyze the relationship between soil parameters and displacements, and the performance of these surrogate models was analyzed and compared by evaluating the root mean square error. A comparative study was done based on objective functions and optimization techniques. The objective functions are categorized by considering measured data with and without instrument uncertainty and are defined by the least squares method, which estimates the norm between the predicted displacements and the measured values. Hydro-Québec provided the data sets of measured values for the Romaine-2 dam. Stochastic optimization, an approach that can overcome local minima and solve non-convex and non-differentiable problems with ease, is used to obtain an optimum value. The Genetic Algorithm (GA), Particle Swarm Optimization (PSO), and Differential Evolution (DE) are compared for the minimization problem; all of these techniques take time to converge to an optimum value, but PSO provided the best convergence and the best soil parameters. Overall, parameter identification analysis can be used effectively for rockfill dam applications and has the potential to become a valuable tool for geotechnical engineers assessing dam performance and dam safety.
Keywords: rockfill dam, parameter identification, stochastic analysis, regression, Plaxis
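A minimal sketch of the inverse-analysis loop described above: choose constitutive parameters that minimize the least-squares misfit between measured and predicted displacements. A toy quadratic surrogate stands in for the Plaxis finite-element run, and SciPy's differential evolution (one of the three compared techniques) stands in for the full GA/PSO/DE comparison; the parameter names and bounds are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import differential_evolution

true_params = np.array([45.0, 0.30])      # e.g. friction angle, Poisson ratio
gauges = np.linspace(0.1, 1.0, 8)         # instrument locations (normalised)

def predict_displacements(params, x):
    """Stand-in for a Plaxis run: displacement profile for given parameters."""
    phi, nu = params
    return (50.0 / phi) * x + nu * x**2

measured = predict_displacements(true_params, gauges) \
           + np.random.default_rng(1).normal(0.0, 0.005, gauges.size)

def misfit(params):
    """Least-squares objective: norm of predicted minus measured values."""
    return np.sum((predict_displacements(params, gauges) - measured) ** 2)

bounds = [(20.0, 60.0), (0.1, 0.45)]
result = differential_evolution(misfit, bounds, seed=1)
print("identified parameters:", np.round(result.x, 3))
```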
Procedia PDF Downloads 146
25332 Harmonic Data Preparation for Clustering and Classification
Authors: Ali Asheibi
Abstract:
The rapid increase in the size of the databases required to store power quality monitoring data has demanded new techniques for analysing and understanding the data. One suggested technique to assist in the analysis is data mining. Preparing raw data to be ready for data mining exploration takes up most of the effort and time spent in the whole data mining process. Clustering is an important technique in data mining and machine learning in which underlying and meaningful groups of data are discovered. Large amounts of harmonic data have been collected by an actual harmonic monitoring system in a distribution system in Australia over three years. This amount of acquired data makes it difficult to identify operational events that significantly impact the harmonics generated in the system. In this paper, harmonic data preparation processes that lead to a better understanding of the data are presented. Underlying classes in the data have then been identified using a clustering technique based on the Minimum Message Length (MML) method. The underlying operational information contained within the clusters can be rapidly visualised by engineers. The C5.0 algorithm was used for the classification and interpretation of the generated clusters.
Keywords: data mining, harmonic data, clustering, classification
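A hedged sketch of the two-stage pipeline described above: cluster the harmonic measurements, then fit a decision tree to interpret the clusters. A Gaussian mixture selected by BIC stands in for MML-based clustering, and scikit-learn's CART tree stands in for C5.0; neither is the paper's exact tool, and the features are invented.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
# Toy stand-in for monitoring records: THD and 5th/7th harmonic magnitudes.
X = np.vstack([rng.normal(m, 0.3, size=(100, 3)) for m in (1.0, 2.5, 4.0)])
X = StandardScaler().fit_transform(X)

# Pick the number of clusters by an information criterion (BIC here).
models = [GaussianMixture(k, random_state=0).fit(X) for k in range(1, 6)]
best = min(models, key=lambda m: m.bic(X))
labels = best.predict(X)

# Classify the clusters so engineers can read the rules behind them.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, labels)
print(f"clusters found: {best.n_components}")
print(export_text(tree, feature_names=["THD", "H5", "H7"]))
```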
Procedia PDF Downloads 247
25331 Developing a Deep Understanding of the Immune Response in Hepatitis B Virus Infected Patients Using a Knowledge Driven Approach
Authors: Hanan Begali, Shahi Dost, Annett Ziegler, Markus Cornberg, Maria-Esther Vidal, Anke R. M. Kraft
Abstract:
Chronic hepatitis B virus (HBV) infection can be treated, for example, with nucleot(s)ide analogs (NAs), which inhibit HBV replication. However, NAs have hardly any influence on the functional cure of HBV, which is defined by hepatitis B surface antigen (HBsAg) loss. NAs need to be taken life-long, which is not feasible for all patients worldwide. Additionally, NA-treated patients are still at risk of developing cirrhosis, liver failure, or hepatocellular carcinoma (HCC). Although each patient has the same components of the immune system, immune responses vary between patients. Therefore, a deeper understanding of the immune response against HBV in different patients is necessary to understand the parameters leading to HBV cure and to use this knowledge to optimize HBV therapies. This requires seamless integration of an enormous amount of diverse and fine-grained data on viral markers, e.g., hepatitis B core-related antigen (HBcrAg) and hepatitis B surface antigen (HBsAg). The data integration system relies on the assumption that profiling human immune systems requires the analysis of various variables (e.g., demographic data, treatments, pre-existing conditions, immune cell response, or HLA-typing) rather than only one. However, the values of these variables are collected independently, and they are presented in a myriad of formats, e.g., Excel files, textual descriptions, lab book notes, and images of flow cytometry dot plots. Additionally, patients can be identified differently across these analyses. This heterogeneity complicates the integration of variables, as data management techniques are needed to create a unified view in which individual formats and identifiers are transparent when profiling the human immune systems. The proposed study (HBsRE) aims at integrating heterogeneous data sets of 87 chronically HBV-infected patients, e.g., clinical data, immune cell response, and HLA-typing, with knowledge encoded in biomedical ontologies and open-source databases into a knowledge-driven framework. This new technique enables us to harmonize and standardize heterogeneous datasets in the defined modeling of the data integration system, which will be evaluated in a knowledge graph (KG). KGs are data structures that represent knowledge and data as factual statements using a graph data model. Finally, the analytic data model will be applied on top of the KG in order to develop a deeper understanding of the immune profiles of various patients and to evaluate the factors that play a role in a holistic profile of patients with HBsAg loss. Additionally, our objective is to utilize this unified approach to stratify patients for new effective treatments. This study is developed in the context of the project “Transforming big data into knowledge: for deep immune profiling in vaccination, infectious diseases, and transplantation (ImProVIT)”, a multidisciplinary team composed of computer scientists, infection biologists, and immunologists.
Keywords: chronic hepatitis B infection, immune response, knowledge graphs, ontology
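A small sketch of the core harmonization step: loading records from different sources into one RDF knowledge graph with rdflib, where a shared patient IRI unifies the sources and queries run over the merged graph. The EX namespace, property names, and the two toy records are assumptions for illustration, not the HBsRE schema.

```python
from rdflib import Graph, Literal, Namespace, RDF

EX = Namespace("http://example.org/hbv/")
g = Graph()

records = [  # e.g. one row from a clinical file, one from an HLA-typing file
    {"id": "patient_01", "hbsag_level": 250.0, "treatment": "NA"},
    {"id": "patient_01", "hla": "HLA-A*02:01"},
]
for rec in records:
    patient = EX[rec["id"]]                # the same IRI unifies both sources
    g.add((patient, RDF.type, EX.Patient))
    if "hbsag_level" in rec:
        g.add((patient, EX.hbsAgLevel, Literal(rec["hbsag_level"])))
        g.add((patient, EX.treatment, Literal(rec["treatment"])))
    if "hla" in rec:
        g.add((patient, EX.hlaAllele, Literal(rec["hla"])))

# Analytics on top of the KG: which NA-treated patients carry this allele?
q = """SELECT ?p WHERE {
         ?p <http://example.org/hbv/treatment> "NA" ;
            <http://example.org/hbv/hlaAllele> "HLA-A*02:01" . }"""
for row in g.query(q):
    print(row.p)
```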
Procedia PDF Downloads 108
25330 Focus-Latent Dirichlet Allocation for Aspect-Level Opinion Mining
Authors: Mohsen Farhadloo, Majid Farhadloo
Abstract:
Aspect-level opinion mining, which aims at discovering aspects (aspect identification) and their corresponding ratings (sentiment identification) from customer reviews, has increasingly attracted the attention of researchers and practitioners, as it provides valuable insights about products and services from the customers' points of view. Instead of addressing aspect identification and sentiment identification in two separate steps, it is possible to identify both aspects and sentiments simultaneously. In recent years, many graphical models based on Latent Dirichlet Allocation (LDA) have been proposed to solve both aspect and sentiment identification in a single step. Although LDA models have been effective tools for the statistical analysis of document collections, they also have shortcomings in addressing some unique characteristics of opinion mining. Our goal in this paper is to address one of the limitations of topic models to date: they fail to directly model the associations among topics. Indeed, in many text corpora it is natural to expect that subsets of the latent topics have higher probabilities. We propose a probabilistic graphical model called focus-LDA to better capture the associations among topics when applied to aspect-level opinion mining. Our experiments on real-life data sets demonstrate the improved effectiveness of the focus-LDA model in terms of the accuracy of the predictive distributions over held-out documents. Furthermore, we demonstrate qualitatively that the focus-LDA topic model provides a natural way of visualizing and exploring unstructured collections of textual data.
Keywords: aspect-level opinion mining, document modeling, Latent Dirichlet Allocation, LDA, sentiment analysis
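Since focus-LDA itself is not publicly released, the following is only a hedged baseline sketch: standard LDA over toy review snippets, the point being that each topic's top words suggest an aspect (food, service, etc.). The review texts and topic count are invented.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

reviews = [
    "the pasta was delicious and the dessert was great",
    "our waiter was rude and the service was slow",
    "great food but terrible service",
    "the soup tasted amazing, lovely dessert too",
    "friendly staff, quick service, helpful waiter",
]
vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(reviews)

lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)
terms = vec.get_feature_names_out()
for k, topic in enumerate(lda.components_):
    top = topic.argsort()[-4:][::-1]       # four highest-weight words
    print(f"aspect {k}:", ", ".join(terms[i] for i in top))
```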
Procedia PDF Downloads 94
25329 Proposal of Data Collection from Probes
Authors: M. Kebisek, L. Spendla, M. Kopcek, T. Skulavik
Abstract:
In our paper, we describe the security capabilities of data collection. Data are collected with probes located in the near and distant surroundings of the company. Given the numerous obstacles, e.g., forests, hills, and urban areas, the data collection is realized in several ways: via wireless communication, the LAN network, the GSM network, and, in certain areas, by vehicles. In order to ensure the connection to the server, most of the probes are able to communicate in several ways. Collected data are archived and subsequently used in supervisory applications. To ensure the collection of the required data, it is necessary to propose algorithms that allow the probes to select a suitable communication channel.
Keywords: communication, computer network, data collection, probe
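A hedged sketch of the channel-selection idea: a probe tries its available links in order of preference and falls back when one is down. The link names, endpoints, and the TCP reachability check are illustrative assumptions; the paper does not publish its selection algorithm.

```python
import socket

PREFERRED_CHANNELS = [
    ("wireless", "10.0.0.5", 9000),
    ("lan",      "192.168.1.5", 9000),
    ("gsm",      "gsm-gateway.example.com", 9000),
]

def reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Cheap reachability probe: can we open a TCP connection?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def select_channel():
    """Return the first channel whose server endpoint answers."""
    for name, host, port in PREFERRED_CHANNELS:
        if reachable(host, port):
            return name
    return None  # no link available; buffer data until a vehicle collects it

print("using channel:", select_channel())
```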
Procedia PDF Downloads 360
25328 Calculation of Electronic Structures of Nickel in Interaction with Hydrogen by Density Functional Theoretical (DFT) Method
Authors: Choukri Lekbir, Mira Mokhtari
Abstract:
Hydrogen-material interactions and mechanisms can be modeled at the nano scale by quantum methods. In this work, the effect of hydrogen on the electronic properties of a cluster model of the material, nickel, has been studied using the density functional theory (DFT) method. Two types of clusters were optimized: nickel and the hydrogen-nickel system. For nickel clusters (n = 1-6) without hydrogen, three types of electronic structures (neutral, cationic, and anionic) were optimized with three basis-set calculations (B3LYP/LANL2DZ, PW91PW91/DGDZVP2, PBE/DGDZVP2). Comparing the binding energies and bond lengths of the three structures obtained with these basis sets shows that the results for the neutral and anionic nickel clusters are in good agreement with the experimental results. For the neutral and anionic clusters, comparing the energies and bond lengths obtained with the three basis sets shows that PBE/DGDZVP2 best matches the experimental results. For anionic nickel clusters (n = 1-6) with hydrogen present, optimization of the anionic hydrogen-nickel structures at PBE/DGDZVP2 shows that the binding energies and bond lengths increase compared to those of the anionic nickel clusters without hydrogen, which reveals the armor effect exerted by hydrogen on the electronic structure of nickel, due to the storing of hydrogen energy within the nickel cluster structures. The comparison of bond lengths between the two sets of clusters shows the expansion of the cluster geometry due to the presence of hydrogen.
Keywords: binding energies, bond lengths, density functional theory, geometry optimization, hydrogen energy, nickel cluster
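A hedged sketch of the binding-energy calculation described above, using PySCF with the PBE functional. The def2-SVP basis, the spin states, and the starting Ni-Ni distance are illustrative assumptions (the paper used LANL2DZ/DGDZVP2-type basis sets), and transition-metal SCF may need tighter convergence settings in practice.

```python
from pyscf import gto, dft

def pbe_energy(atom_spec, spin):
    """Unrestricted Kohn-Sham PBE total energy for the given geometry."""
    mol = gto.M(atom=atom_spec, basis="def2-svp", spin=spin)
    mf = dft.UKS(mol)
    mf.xc = "pbe"
    return mf.kernel()            # total electronic energy in Hartree

e_atom = pbe_energy("Ni 0 0 0", spin=2)                  # 3d8 4s2 triplet atom
e_dimer = pbe_energy("Ni 0 0 0; Ni 0 0 2.15", spin=2)    # ~2.15 A bond guess

# Binding energy of Ni2: energy gained by forming the cluster from atoms.
print("E_bind(Ni2) =", 2 * e_atom - e_dimer, "Hartree")
```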
Procedia PDF Downloads 422
25327 Tectogenesis Around Kalaat Es Senan, Northwest of Tunisia: Structural, Geophysical and Gravimetric Study
Authors: Amira Rjiba, Mohamed Ghanmi, Tahar Aifa, Achref Boulares
Abstract:
This study, involving the interpretation of geological outcrop data (structures and lithostratigraphic columns) and subsurface data (seismic and gravimetric), helps us (i) to identify the lithology of the sedimentary formations between the Aptian and the recent formations, (ii) to differentiate them from the salt-bearing Triassic, and (iii) to specify the major structures and the tectonic effects that have affected the region during its geological evolution. Placing our study area in the context of Tunisia, located on the southern margin of the Tethys, the tectonic traces and the structural analysis conducted show that this area underwent active rifting during the Triassic, which triggered extensional tectonic events in the Cretaceous and the Paleogene. Lithostratigraphic correlations between the outcrop and seismic data sets and those of six oil wells drilled in the region have allowed us to better understand the structural complexity and the role of the different tectonic faults that contributed to the current configuration, marked by the present rifts. Indeed, three fault directions, NW-SE, NNW-SSE to N-S, and NE-SW to E-W, played a major role in the genesis of the folds and of the NW-SE-trending collapse trenches. These results were complemented by seismic reflection data to clarify the geometry of the southern and western areas of the Kalaa Khasba trough. The eight seismic lines selected for this study made it possible to characterize the main structures with isochron, contour, and isovelocity maps of the Serdj horizon, which constitutes the main reservoir in the region. Line L2, calibrated by well 6, helped highlight the NW-SE compression that has resulted in persistent unconformities widely identifiable in its lithostratigraphic column. The gravity survey confirmed the subsurface extension of most of the deep faults, whose activity appears to persist. Gravimetry also reinforced the seismic interpretation by confirming, at well 6, that the SW and NE flanks of the trough are two opposing faults that trace the boundaries of an NNW-SSE-trending graben whose sedimentary fill is of Mio-Pliocene and Quaternary age.
Keywords: graben, graben collapse, gravity, Kalat Es Senan, seismic, tectogenesis
Procedia PDF Downloads 367
25326 A Review on Big Data Movement with Different Approaches
Authors: Nay Myo Sandar
Abstract:
With the growth of technologies and applications, a large amount of data is being produced at an increasing rate by various resources such as social media networks, sensor devices, and other information-serving devices. This large, complex, and exponentially growing collection of datasets is called big data. Traditional database systems cannot store and process such data due to its volume and complexity. Consequently, cloud computing is a potential solution for data storage and processing, since it can provide a pool of resources for servers and storage. However, moving large amounts of data to and from the cloud is a challenging issue, since it can incur high latency due to the large data size. With respect to the big data movement problem, this paper reviews the previous work, discusses research issues, and surveys approaches for dealing with the problem.
Keywords: big data, cloud computing, big data movement, network techniques
Procedia PDF Downloads 86
25325 Exploring Polar Syntactic Effects of Verbal Extensions in Basà Language
Authors: Imoh Philip
Abstract:
This work investigates four verbal extensions, two in each set, that produce two opposite effects on the valency of verbs in the Basà language. Basà is an indigenous language spoken in Kogi, Nasarawa, Benue, and Niger states and in all the Federal Capital Territory (FCT) councils. Crozier & Blench (1992) and Blench & Williamson (1988) classify Basà as belonging to Proto-Kru, under the sub-phylum Western Kru. The study examines the effects of such morphosyntactic operations in Basà, with special focus on ‘reflexives’ and ‘reciprocals’ versus ‘causativization’ and ‘applicativization’; the two sets are characterized by polar syntactic processes that either decrease or increase the verb’s valency by one argument relative to the basic number of arguments, but through similar morphological processes. In addition to my intuitions as a native speaker of Basà, the data elicited for this work include discourse observation and staged and elicited spoken data from fluent native speakers. The paper argues that affixes attached to the verb root either reduce the verb’s valency, deriving an intransitive verb from a transitive one or a transitive verb from a bi/ditransitive one, or equally increase the verb’s valency, deriving a bitransitive verb from a transitive verb or a transitive verb from an intransitive one. Where the operation increases the verb’s valency, it triggers a transformation of arguments in the derived structure; in this case, the applied arguments displace the inherent ones. This investigation can stimulate further study of other transformations, syntactic or morphosyntactic, in Basà and can be replicated in other African and non-African languages.
Keywords: verbal extension, valency, reflexive, reciprocal, causativization, applicativization, Basà
Procedia PDF Downloads 201
25324 Optimized Approach for Secure Data Sharing in Distributed Database
Authors: Ahmed Mateen, Zhu Qingsheng, Ahmad Bilal
Abstract:
In the current age of technology, information is the most precious asset of a company. Today, companies hold large amounts of data, and as the data grow larger, access to particular pieces of information becomes slower day by day. Faster processing of data to shape it into information is the biggest issue. The major problems in distributed databases are the efficiency of data distribution and the response time of data distribution; the security of data distribution is also a big issue. For these problems, we propose a strategy that can maximize the efficiency of data distribution and also improve its response time. This technique gives better results for secure data distribution from multiple heterogeneous sources. The newly proposed technique enables companies to share data securely, efficiently, and quickly.
Keywords: ER-schema, electronic record, P2P framework, API, query formulation
Procedia PDF Downloads 333
25323 Combined Automatic Speech Recognition and Machine Translation in Business Correspondence Domain for English-Croatian
Authors: Sanja Seljan, Ivan Dunđer
Abstract:
The paper presents combined automatic speech recognition (ASR) for English and machine translation (MT) for English and Croatian in the domain of business correspondence. The first part presents the results of training a commercial ASR system on two English data sets, enriched by error analysis. The second part presents the results of machine translation performed by the online tool Google Translate for the English-Croatian and Croatian-English language pairs. Human evaluation in terms of usability is conducted, and internal consistency is calculated by Cronbach's alpha coefficient, enriched by error analysis. Automatic evaluation is performed with the WER (Word Error Rate) and PER (Position-independent word Error Rate) metrics, followed by an investigation of Pearson's correlation with the human evaluation.
Keywords: automatic machine translation, integrated language technologies, quality evaluation, speech recognition
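A small sketch of the WER metric mentioned above: word error rate is the Levenshtein distance between the reference and hypothesis word sequences, normalised by the reference length. The two example sentences are made up, not from the paper's data sets.

```python
def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edits to turn the first i ref words into the first j hyp words
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dp[i][j] = min(sub,                # substitution (or match)
                           dp[i - 1][j] + 1,   # deletion
                           dp[i][j - 1] + 1)   # insertion
    return dp[-1][-1] / len(ref)

print(wer("please send the invoice today",
          "please send invoice to day"))       # 3 edits / 5 words = 0.6
```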
Procedia PDF Downloads 484
25322 The Architecture, Engineering and Construction (AEC) New Paradigm Shift: Building Information Modelling Trend in the United Arab Emirates
Authors: Salem B. Abdalla
Abstract:
This study investigated the current Building Information Modelling (BIM) trends and practices in the UAE, particularly to shed light on the recently circulated Dubai BIM mandate. Two sets of surveys were mailed to the AEC industry and the corresponding academic sector within the UAE to collect up-to-date data on BIM awareness and utilization. The surveys produced startling results concerning the academic sector in the UAE, where almost 70% of respondents were not aware of the BIM mandate. Among the rest, even when aware, the majority of mechanical and electrical engineering schools felt that BIM is not pertinent to their discipline; accordingly, the response to offering BIM in their curriculum was substantially low (35%). On the other hand, the industry survey identified that a large majority (76.5%) of the AEC industry in the UAE is using BIM. The results clearly indicate that academia should include BIM in its curriculum to produce qualified graduates who can support the market. However, academia also faces several obstacles to implementing BIM in the curriculum, the main pretext being that there is “no room for new courses in the existing curriculum”.
Keywords: building information modeling, BIM adoption, UAE BIM industry survey, UAE BIM academia survey, Dubai BIM mandate, UK BIM mandate, BIM education, architecture education, engineering schools, BIM implementation, BIM curriculum
Procedia PDF Downloads 414
25321 Clustering Ethno-Informatics of Naming Village in Java Island Using Data Mining
Authors: Atje Setiawan Abdullah, Budi Nurani Ruchjana, I. Gede Nyoman Mindra Jaya, Eddy Hermawan
Abstract:
Ethnoscience looks at culture from a scientific perspective, which may help us understand how people develop various forms of knowledge and belief, focusing initially on ecology and on the history of the contributions that have been made. One of the areas studied in ethnoscience is ethno-informatics, the application of informatics to culture. In this study, the informatics science used is data mining, a process for automatically extracting knowledge from large databases in order to obtain interesting patterns and, from them, knowledge. The cultural application is the naming of villages, based on a database of village names on the island of Java obtained from the Indonesian Geospatial Information Agency (BIG) in 2014. The purposes of this study are: first, to classify the names of villages on the island of Java based on the word structure of the name, including the prefix of the word, the syllables contained, and the complete word; second, to classify the meanings of village names into specific categories, as well as their role in community behavioral characteristics; and third, to visualize the village names on a location map in order to see the similarity of village naming across provinces. In this research we have developed two theorems: an area theorem, which collects the intersections of village names across the provinces on the island of Java, and a wedge-composition theorem on the sets of provinces in Java, which is used to examine the peculiarities of a studied location. The methodology of this study is based on the Knowledge Discovery in Databases (KDD) method in data mining, whose process includes preprocessing, data mining, and postprocessing. The results show that Javanese communities prioritize merit in their lives, always working hard to achieve a more prosperous life, and value water and environmental sustainability. Village naming in adjacent provinces shows a high degree of similarity, and the provinces influence one another. The cultural similarity among Central Java, East Java, and West Java-Banten is high, whereas Jakarta-Yogyakarta shows low similarity. This research characterizes the cultural character of communities through the meanings of village names on the island of Java; this character is expected to serve as a guide to people's daily behavior on the island of Java.
Keywords: ethnoscience, ethno-informatics, data mining, clustering, Java island culture
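A hedged sketch of the similarity comparison described above: represent each province by the set of village-name prefixes found there and compare provinces with the Jaccard index. The village names below are invented samples, not records from the BIG database, and the prefix length is an arbitrary choice.

```python
provinces = {
    "Central Java": ["Karanganyar", "Karangrejo", "Sukamaju", "Sidorejo"],
    "East Java":    ["Karangploso", "Sidoarjo", "Sukodono", "Karangan"],
    "Jakarta":      ["Kebonjeruk", "Cempakaputih", "Pulogadung"],
}

def prefixes(names, k=4):
    """Set of k-letter prefixes, a crude stand-in for word-structure classes."""
    return {name[:k].lower() for name in names}

def jaccard(a, b):
    """Similarity of two sets: size of intersection over size of union."""
    return len(a & b) / len(a | b)

keys = list(provinces)
for i, p in enumerate(keys):
    for q in keys[i + 1:]:
        sim = jaccard(prefixes(provinces[p]), prefixes(provinces[q]))
        print(f"{p} vs {q}: {sim:.2f}")   # adjacent provinces score higher
```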
Procedia PDF Downloads 283
25320 Copper Price Prediction Model for Various Economic Situations
Authors: Haidy S. Ghali, Engy Serag, A. Samer Ezeldin
Abstract:
Copper is an essential raw material used in the construction industry. During 2021 and the first half of 2022, the global market suffered a significant fluctuation in copper raw material prices due to the aftermath of both the COVID-19 pandemic and the Russia-Ukraine war, which exposed consumers to unexpected financial risk. To that end, this paper aims to develop two ANN-LSTM price prediction models, using Python, that can forecast the average monthly copper prices traded on the London Metal Exchange; the first is a multivariate model that forecasts the copper price one month ahead, and the second is a univariate model that predicts the copper prices of the upcoming three months. Historical data on average monthly London Metal Exchange copper prices were collected from January 2009 to July 2022, and potential external factors were identified and employed in the multivariate model. These factors lie under three main categories, including energy prices and economic indicators of the three major copper-exporting countries, depending on data availability. Before developing the LSTM models, the collected external parameters were analyzed with respect to the copper prices using correlation and multicollinearity tests in the R software; the parameters were then further screened to select those that influence the copper prices. The two LSTM models were then developed, and the dataset was divided into training, validation, and testing sets. The results show that the performance of the 3-month prediction model is better than that of the 1-month prediction model, but both models can act as predicting tools for diverse economic situations.
Keywords: copper prices, prediction model, neural network, time series forecasting
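A hedged sketch of the univariate LSTM forecaster: slide a window over the monthly price series and learn to predict the next value. The synthetic series, window length, and layer sizes are illustrative assumptions, not the paper's tuned configuration.

```python
import numpy as np
from tensorflow import keras

rng = np.random.default_rng(0)
prices = 7000 + np.cumsum(rng.normal(0, 150, size=160))  # fake monthly LME series
mu, sd = prices.mean(), prices.std()
scaled = (prices - mu) / sd                               # LSTMs train better scaled

window = 12
X = np.stack([scaled[i:i + window]
              for i in range(len(scaled) - window)])[..., None]
y = scaled[window:]

model = keras.Sequential([
    keras.layers.LSTM(32, input_shape=(window, 1)),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X[:-1], y[:-1], epochs=25, verbose=0)           # hold out the last month

pred = model.predict(X[-1:], verbose=0)[0, 0] * sd + mu
print(f"forecast {pred:.0f} USD/t vs actual {y[-1] * sd + mu:.0f}")
```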
Procedia PDF Downloads 113
25319 Artificial Intelligence for Traffic Signal Control and Data Collection
Authors: Reggie Chandra
Abstract:
Traffic accidents and traffic signal optimization are correlated. However, 70-90% of the traffic signals across the USA are not synchronized. The reason behind that is insufficient resources to create and implement timing plans. In this work, we will discuss the use of a breakthrough Artificial Intelligence (AI) technology to optimize traffic flow and to collect accurate traffic data 24/7/365 using a vehicle detection system. We will discuss the recent advances in Artificial Intelligence technology, how AI works in vehicle, pedestrian, and bike data collection, how timing plans are created, and what the best workflow for that is. This paper will also showcase how Artificial Intelligence makes signal timing affordable. We will introduce a technology that uses Convolutional Neural Networks (CNN) and deep learning algorithms to detect, collect data, develop timing plans, and deploy them in the field. Convolutional Neural Networks are a class of deep learning networks inspired by the biological processes in the visual cortex. A neural net is modeled after the human brain; it consists of millions of densely connected processing nodes. It is a form of machine learning where the neural net learns to recognize vehicles through training, which is called Deep Learning. The well-trained algorithm overcomes most of the issues faced by other detection methods and provides nearly 100% traffic data accuracy. Through this continuous learning-based method, we can constantly update traffic patterns and generate an unlimited number of timing plans, and thus improve vehicle flow. Convolutional Neural Networks not only outperform other detection algorithms but also, in cases such as classifying objects into fine-grained categories, outperform humans. Safety is of primary importance to traffic professionals, but they don't have the studies or data to support their decisions. Currently, one-third of transportation agencies do not collect pedestrian and bike data. We will discuss how the use of Artificial Intelligence for data collection can help reduce pedestrian fatalities and enhance the safety of all vulnerable road users. Moreover, it provides traffic engineers with tools that allow them to unleash their potential, instead of dealing with constant complaints, snapshots of limited handpicked data, and multiple systems requiring additional work for adaptation. The methodologies used and proposed in the research contain a camera-based identification method built on deep Convolutional Neural Networks. The proposed application was evaluated on our data sets, acquired under a variety of daily real-world road conditions, and compared with the performance of the commonly used methods that require collecting data by counting, evaluating it, and running it through well-established algorithms before deploying it to the field. This work explores themes such as how technologies powered by Artificial Intelligence can benefit your community and how to translate the complex and often overwhelming benefits into a language accessible to elected officials, community leaders, and the public. Exploring such topics empowers citizens with insider knowledge about the potential of better traffic technology to save lives and improve communities. The synergies that Artificial Intelligence brings to traffic signal control and data collection are unsurpassed.
Keywords: artificial intelligence, convolutional neural networks, data collection, signal control, traffic signal
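A hedged sketch of CNN-based vehicle counting: run a pretrained detector over a frame and count detections in COCO vehicle classes. The random tensor stands in for a real intersection camera frame, and the score cutoff and class set are illustrative choices, not the product's actual pipeline.

```python
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn

VEHICLES = {3: "car", 4: "motorcycle", 6: "bus", 8: "truck"}  # COCO class ids

model = fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()
frame = torch.rand(3, 480, 640)          # stand-in for one camera frame (CHW)

with torch.no_grad():
    out = model([frame])[0]              # boxes, labels, scores for the frame

count = sum(
    1
    for label, score in zip(out["labels"].tolist(), out["scores"].tolist())
    if label in VEHICLES and score > 0.5
)
print("vehicles in frame:", count)
```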
Procedia PDF Downloads 169
25318 Chinese Sentence Level Lip Recognition
Authors: Peng Wang, Tigang Jiang
Abstract:
Computer-based lip reading methods cannot be universal across languages. At present, research on Chinese lip reading, whether on data sets or recognition algorithms, is far from mature. In this paper, we study a machine learning approach to Chinese lip reading and propose a Chinese sentence-level lip-reading network (CNLipNet) model consisting of a spatio-temporal convolutional neural network (CNN), a recurrent neural network (RNN), and the Connectionist Temporal Classification (CTC) loss function. This model can map variable-length sequences of video frames to Chinese pinyin sequences and is trained end-to-end. Moreover, we create CNLRS, a Chinese lip-reading dataset, which contains 5,948 samples and can be shared through GitHub. The evaluation of CNLipNet on this dataset yielded a 41% word correct rate and a 70.6% character correct rate. This result is far superior to professional human lip readers, indicating that CNLipNet performs well in lip reading.
Keywords: lipreading, machine learning, spatio-temporal, convolutional neural network, recurrent neural network
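A hedged sketch of a CNLipNet-style pipeline in PyTorch: 3D convolutions over the frame sequence, a recurrent layer, and CTC loss mapping frames to a pinyin token sequence. The layer sizes and vocabulary are invented; the actual CNLipNet architecture is described only at the level of detail in the abstract.

```python
import torch
import torch.nn as nn

VOCAB = 30                      # e.g. pinyin units plus the CTC blank (id 0)

class LipNetSketch(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(           # spatio-temporal front end
            nn.Conv3d(3, 16, (3, 5, 5), stride=(1, 2, 2), padding=(1, 2, 2)),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d((None, 4, 4)),
        )
        self.rnn = nn.GRU(16 * 4 * 4, 64, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(128, VOCAB)

    def forward(self, x):                    # x: (batch, 3, T, H, W)
        f = self.conv(x)                     # (batch, 16, T, 4, 4)
        f = f.permute(0, 2, 1, 3, 4).flatten(2)   # (batch, T, 256)
        h, _ = self.rnn(f)
        return self.fc(h).log_softmax(-1)    # per-frame token log-probs

model = LipNetSketch()
video = torch.rand(2, 3, 40, 64, 128)        # two clips of 40 lip-region frames
logp = model(video).permute(1, 0, 2)         # CTC expects (T, batch, vocab)

targets = torch.randint(1, VOCAB, (2, 12))   # fake pinyin label sequences
loss = nn.CTCLoss(blank=0)(
    logp, targets,
    input_lengths=torch.full((2,), 40),
    target_lengths=torch.full((2,), 12),
)
print("CTC loss:", float(loss))
```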
Procedia PDF Downloads 128
25317 Study of Two Adsorbent-Refrigerant Pairs for the Application of Solar-Powered Adsorption Refrigeration System
Authors: Mohammed Ali Hadj Ammar, Fethi Bouras, Kamel Sahlaoui
Abstract:
This article presents a detailed study of two working pairs intended for use in a solar adsorption refrigeration (SAR) system. The study was based on two indicators: daily production and the coefficient of performance (COP). The thermodynamic cycle of the system is based on the adsorption phenomenon at constant temperature. A computer simulation program has been developed for modeling and performance evaluation of the solar-powered adsorption refrigeration cycle. It was found that the maximal cycled mass is obtained by S40/water (0.280 kg/kg), followed by CarboTech C40/1/methanol (0.260 kg/kg). At a condenser temperature of 30 °C, with an adsorbent mass of 38.59 kg and an integrated collector/bed configuration, the pair CarboTech C40/1/methanol, for ice-making, can reach a cycle COP of 0.63 and produce about 13.6 kg of ice per day, while the pair S40/water, for air-conditioning, can reach a cycle COP of 0.66 and a daily cold-water production of 212 kg. Additionally, adequate indicators addressing the economic and environmental aspects associated with each working pair are evaluated.
Keywords: solar adsorption, refrigeration, activated carbon, silica gel
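A back-of-the-envelope check on the figures quoted above, under stated assumptions: water cooled from 25 °C to 0 °C and then frozen, with the driving solar heat implied by the cycle COP. The 25 °C inlet temperature is an assumption; the abstract does not specify it.

```python
C_WATER = 4.186e3     # J/(kg K), specific heat of liquid water
L_FUSION = 334e3      # J/kg, latent heat of fusion of ice

m_ice = 13.6          # kg/day, CarboTech C40/1 + methanol ice maker
cop = 0.63            # cycle coefficient of performance

q_cold = m_ice * (C_WATER * (25 - 0) + L_FUSION)   # J of cooling per day
q_solar = q_cold / cop                              # driving heat required

print(f"daily cooling load : {q_cold / 1e6:.1f} MJ")   # ~6.0 MJ
print(f"solar heat required: {q_solar / 1e6:.1f} MJ")  # ~9.5 MJ
```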
Procedia PDF Downloads 131
25316 A Fact-Finding Analysis on the Expulsions Made under Title 42 in Us
Authors: Avi Shrivastava
Abstract:
Title 42, an emergency health decree, has forced the federal authorities to turn away asylum seekers and all other border crossers since last year. When Title 42 was first deployed in immigration detention centers, where many migrants are held when they arrive at the U.S.-Mexico border, the Trump administration embraced it as a strategy. The expulsions policy and the new border challenges are examined with regard to Title 42 concerns, and humanitarian measures for refugees arriving at the US-Mexico border are the focus of this article. To a large extent, this article addresses the implications of the United States' use of Title 42 in expelling refugees and the possible ramifications of doing away with it. A secondary data collection strategy was used to gather the information for this study, allowing a large number of previously collected data sets to be examined. Information about Title 42 may be found in a variety of sources, such as scholarly publications, newspapers, books, and the internet. The inquiry employed qualitative and explanatory research approaches. The policy is claimed to have forced 1.7 million individuals to leave the country. Since CBP and ICE were limited in their ability to process deportees, a very random patchwork technique was employed in selecting the expelled individuals; as a consequence, repeat offenders, particularly those who were single, received reduced punishment. Halting expulsions will compel the government to focus on long-overdue but vital border improvements. Title 42 provisions may help expedite the processing of asylum and other types of humanitarian relief. The government is prepared for an increase in arrivals, but ending the program would lead to a return to the arrival levels seen during the Title 42 period.
Keywords: migrants, refugees, title 42, medical, trump administration
Procedia PDF Downloads 87
25315 Data Mining Algorithms Analysis: Case Study of Price Predictions of Lands
Authors: Julio Albuja, David Zaldumbide
Abstract:
Data analysis is an important step before making decisions about money. The aim of this work is to analyze, through data mining algorithms, the factors that influence the final price of houses. To the best of our knowledge, previous work was conducted only to compare results. Before using the data set, the Z-transformation was applied to standardize the data to the same range. The data were then classified into two groups in order to visualize them in a readable format. A decision tree was built, and graphical results are displayed in which it is easy to see the results and the factors' influence. The definitions of these methods are described, as well as descriptions of the results. Finally, conclusions and recommendations related to the released results are presented, making it easier to apply these algorithms using a customized data set.
Keywords: algorithms, data, decision tree, transformation
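A hedged sketch of the pipeline described above: z-score standardisation followed by a decision tree on housing features. The two features and the synthetic prices are invented; the paper's data set is not public.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeRegressor, export_text

rng = np.random.default_rng(0)
area = rng.uniform(50, 300, 200)             # m^2
rooms = rng.integers(1, 7, 200)
price = 1000 * area + 15000 * rooms + rng.normal(0, 20000, 200)

X = np.column_stack([area, rooms])
model = make_pipeline(StandardScaler(),      # the "Z-transformation" step
                      DecisionTreeRegressor(max_depth=3, random_state=0))
model.fit(X, price)

# The printed tree makes the factors' influence easy to read off.
tree = model.named_steps["decisiontreeregressor"]
print(export_text(tree, feature_names=["area_z", "rooms_z"]))
```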
Procedia PDF Downloads 374
25314 The Role of Financial Development and Institutional Quality in Promoting Sustainable Development through Tourism Management
Authors: Hashim Zameer
Abstract:
Effective tourism management plays a vital role in promoting sustainability and supporting ecosystems. A common principle in practice over the years has been “first pollute and then clean,” indicating that countries need financial resources to promote sustainability. Financial development and tourism management both seem very important for promoting sustainable development; however, without institutional support, it is very difficult to succeed. In this context, it seems prominently significant to explore how institutional quality, tourism development, and financial development could promote sustainable development. Past research has not explored the role of tourism development in sustainable development, and the roles of financial development, natural resources, and institutional quality in sustainable development have also been ignored. In this regard, this paper investigates the role of tourism development, natural resources, financial development, and institutional quality in sustainable development in China. The study used time-series data from 2000-2021 and employed the Bayesian linear regression model because it is suitable for small data sets. The robustness of the findings was checked using a quantile regression approach. The results reveal that an increase in tourism expenditure stimulates the economy, creates jobs, encourages cultural exchange, and supports sustainability initiatives. Moreover, financial development and institutional quality have a positive effect on sustainable development. However, reliance on natural resources can result in negative economic, social, and environmental outcomes, highlighting the need for resource diversification and management to reinforce sustainable development. These results highlight the significance of financial development, strong institutions, sustainable tourism, and the careful utilization of natural resources for long-term sustainability. The study holds vital insights for policy formulation to promote sustainable tourism.
Keywords: sustainability, tourism development, financial development, institutional quality
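A hedged sketch of the estimation strategy described above: a Bayesian linear regression on a small annual sample, with median (quantile) regression as the robustness check. The series here are simulated; the paper uses Chinese data for 2000-2021 and its own indicator definitions.

```python
import numpy as np
from sklearn.linear_model import BayesianRidge, QuantileRegressor

rng = np.random.default_rng(0)
n = 22                                      # annual data 2000-2021, small sample
tourism = rng.normal(size=n)
finance = rng.normal(size=n)
institutions = rng.normal(size=n)
resources = rng.normal(size=n)
sdg = (0.4 * tourism + 0.3 * finance + 0.2 * institutions
       - 0.3 * resources + rng.normal(0, 0.2, n))   # toy sustainability index

X = np.column_stack([tourism, finance, institutions, resources])

bayes = BayesianRidge().fit(X, sdg)
print("posterior mean coefficients:", np.round(bayes.coef_, 2))

median = QuantileRegressor(quantile=0.5, alpha=0.0).fit(X, sdg)
print("median-regression check:   ", np.round(median.coef_, 2))
```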
Procedia PDF Downloads 81
25313 The Critical Relevance of Credit and Debt Data in Household Food Security Analysis: The Risks of Ineffective Response Actions
Authors: Siddharth Krishnaswamy
Abstract:
Problem Statement: Currently, when analyzing household food security, the most commonly studied food access indicators are household income and expenditure. Larger studies do take into account other indices, such as credit and employment, but these are baseline studies and by definition are conducted infrequently. Food security analysis of access is usually dedicated to analyzing income and expenditure indicators, and both of these indicators are notoriously inconsistent. Yet this data can very often end up being the basis on which household food access is calculated and, by extension, be used for decision making. Objectives: This paper argues that, along with income and expenditure, credit and debit information should be collected so that an accurate analysis of household food security (and in particular food access) can be made. The failure to routinely collect and analyze this information often means that the actual situation is "masked": a household's food access and food availability patterns may appear adequate mainly as a result of borrowing and may even rest on a long-term dependency (a debt cycle). In other words, such a household is in reality worse off than it appears, a factor masked by its performance on basic access indicators. Procedures/methodologies/approaches: Existing food security data sets collected in 2005 in Azerbaijan, in 2010 across Myanmar, and in 2014-15 across Uganda were used to support the theory that analyzing the income and expenditure of a household, versus analyzing the same alongside data on credit and borrowing patterns, results in an entirely different picture of the household's food access. Furthermore, the data analyzed depict food consumption patterns across groups of households and relate these to the extent of dependency on credit, i.e., households borrowing money in order to meet food needs. Finally, response options based on analyzing only income and expenditure and response options based on income, expenditure, credit, and borrowing, from the same geographical area of operation, are studied and discussed. Results: The purpose of this work was to see if existing methods of household food security analysis could be improved. It is hoped that food security analysts will collect household-level information on credit and debit and analyze it against income, expenditure, and consumption patterns. This will help determine whether a household's food access and availability depend on unsustainable strategies such as borrowing money for food or carrying sustained debts. Conclusions: The results clearly show the amount of relevant information that is missing from food access analysis if the debit and borrowing of the household are not analyzed along with the typical food access indicators, and the serious repercussions this has on programmatic response and interventions.
Keywords: analysis, food security indicators, response, resilience analysis
Procedia PDF Downloads 331