Search results for: small data sets
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 27988

Search results for: small data sets

27958 A Fuzzy Nonlinear Regression Model for Interval Type-2 Fuzzy Sets

Authors: O. Poleshchuk, E. Komarov

Abstract:

This paper presents a regression model for interval type-2 fuzzy sets based on the least squares estimation technique. Unknown coefficients are assumed to be triangular fuzzy numbers. The basic idea is to determine aggregation intervals for type-1 fuzzy sets, membership functions of whose are low membership function and upper membership function of interval type-2 fuzzy set. These aggregation intervals were called weighted intervals. Low and upper membership functions of input and output interval type-2 fuzzy sets for developed regression models are considered as piecewise linear functions.

Keywords: interval type-2 fuzzy sets, fuzzy regression, weighted interval

Procedia PDF Downloads 335
27957 A Novel Heuristic for Analysis of Large Datasets by Selecting Wrapper-Based Features

Authors: Bushra Zafar, Usman Qamar

Abstract:

Large data sample size and dimensions render the effectiveness of conventional data mining methodologies. A data mining technique are important tools for collection of knowledgeable information from variety of databases and provides supervised learning in the form of classification to design models to describe vital data classes while structure of the classifier is based on class attribute. Classification efficiency and accuracy are often influenced to great extent by noisy and undesirable features in real application data sets. The inherent natures of data set greatly masks its quality analysis and leave us with quite few practical approaches to use. To our knowledge first time, we present a new approach for investigation of structure and quality of datasets by providing a targeted analysis of localization of noisy and irrelevant features of data sets. Machine learning is based primarily on feature selection as pre-processing step which offers us to select few features from number of features as a subset by reducing the space according to certain evaluation criterion. The primary objective of this study is to trim down the scope of the given data sample by searching a small set of important features which may results into good classification performance. For this purpose, a heuristic for wrapper-based feature selection using genetic algorithm and for discriminative feature selection an external classifier are used. Selection of feature based on its number of occurrence in the chosen chromosomes. Sample dataset has been used to demonstrate proposed idea effectively. A proposed method has improved average accuracy of different datasets is about 95%. Experimental results illustrate that proposed algorithm increases the accuracy of prediction of different diseases.

Keywords: data mining, generic algorithm, KNN algorithms, wrapper based feature selection

Procedia PDF Downloads 293
27956 The Impact of Sport Tourism on Small Scale Business Development in Sri Lanka

Authors: Vimuckthi Charika Wickramaratne, Prasansha Kumari

Abstract:

Sport tourism refers to travel which involves either observing or participating in a sporting event apart from their usual environment. Sport tourism in a fast growing sector of the Sri Lankan travelling industry since Cricket are more popular sport game in the country. This study intends to analyze the impact of these popular sport events for creating and developing small scale business in the country. Primary data gathered from 100 small entrepreneurs around Keththarama Cricket Ground in Sri Lanka. Collected data analyzed using descriptive research methods. The study revealed that local and international visitors for cricket games had impacted on small scale business activities such as retail, handicraft, transport, vehicle parking, small restaurant, hotels, foods and beverage industry. In addition, it was identified that these type of small business are sessional income generating activities for the short period.

Keywords: sport tourism, small scale business, cricket, entrepreneurs

Procedia PDF Downloads 263
27955 Effects of Employees’ Training Program on the Performance of Small Scale Enterprises in Oyo State

Authors: Itiola Kehinde Adeniran

Abstract:

The study examined the effect of employees’ training on the performance of small scale enterprises in Oyo State. A structured questionnaire was used to collect data from 150 respondents through purposive sampling method. Linear regression was used with the aid of statistical package for social science (SPSS) version 20 to analyze the data collected in order to examine the effect of independent variable, employees’ training on dependent variable, performance (profit) of small scale enterprises. The result revealed that employees’ training has a significant effect on the performance of small scale enterprises. It was concluded that predictor variable namely (training) is 55.5% variance of enterprises performance (profitability). Therefore, the paper recommended that all small scale enterprises in Nigeria should embrace manpower training and development in order to improve employees’ performance leading to organizational profitability.

Keywords: training, employee performance, small scale enterprise, organizational profitability

Procedia PDF Downloads 338
27954 Minimizing Mutant Sets by Equivalence and Subsumption

Authors: Samia Alblwi, Amani Ayad

Abstract:

Mutation testing is the art of generating syntactic variations of a base program and checking whether a candidate test suite can identify all the mutants that are not semantically equivalent to the base: this technique is widely used by researchers to select quality test suites. One of the main obstacles to the widespread use of mutation testing is cost: even small pro-grams (a few dozen lines of code) can give rise to a large number of mutants (up to hundreds): this has created an incentive to seek to reduce the number of mutants while preserving their collective effectiveness. Two criteria have been used to reduce the size of mutant sets: equiva-lence, which aims to partition the set of mutants into equivalence classes modulo semantic equivalence, and selecting one representative per class; subsumption, which aims to define a partial ordering among mutants that ranks mutants by effectiveness and seeks to select maximal elements in this ordering. In this paper we analyze these two policies using analytical and em-pirical criteria.

Keywords: mutation testing, mutant sets, mutant equivalence, mutant subsumption, mutant set minimization

Procedia PDF Downloads 32
27953 Linguistic Summarization of Structured Patent Data

Authors: E. Y. Igde, S. Aydogan, F. E. Boran, D. Akay

Abstract:

Patent data have an increasingly important role in economic growth, innovation, technical advantages and business strategies and even in countries competitions. Analyzing of patent data is crucial since patents cover large part of all technological information of the world. In this paper, we have used the linguistic summarization technique to prove the validity of the hypotheses related to patent data stated in the literature.

Keywords: data mining, fuzzy sets, linguistic summarization, patent data

Procedia PDF Downloads 245
27952 Mining Big Data in Telecommunications Industry: Challenges, Techniques, and Revenue Opportunity

Authors: Hoda A. Abdel Hafez

Abstract:

Mining big data represents a big challenge nowadays. Many types of research are concerned with mining massive amounts of data and big data streams. Mining big data faces a lot of challenges including scalability, speed, heterogeneity, accuracy, provenance and privacy. In telecommunication industry, mining big data is like a mining for gold; it represents a big opportunity and maximizing the revenue streams in this industry. This paper discusses the characteristics of big data (volume, variety, velocity and veracity), data mining techniques and tools for handling very large data sets, mining big data in telecommunication and the benefits and opportunities gained from them.

Keywords: mining big data, big data, machine learning, telecommunication

Procedia PDF Downloads 369
27951 A Relative Entropy Regularization Approach for Fuzzy C-Means Clustering Problem

Authors: Ouafa Amira, Jiangshe Zhang

Abstract:

Clustering is an unsupervised machine learning technique; its aim is to extract the data structures, in which similar data objects are grouped in the same cluster, whereas dissimilar objects are grouped in different clusters. Clustering methods are widely utilized in different fields, such as: image processing, computer vision , and pattern recognition, etc. Fuzzy c-means clustering (fcm) is one of the most well known fuzzy clustering methods. It is based on solving an optimization problem, in which a minimization of a given cost function has been studied. This minimization aims to decrease the dissimilarity inside clusters, where the dissimilarity here is measured by the distances between data objects and cluster centers. The degree of belonging of a data point in a cluster is measured by a membership function which is included in the interval [0, 1]. In fcm clustering, the membership degree is constrained with the condition that the sum of a data object’s memberships in all clusters must be equal to one. This constraint can cause several problems, specially when our data objects are included in a noisy space. Regularization approach took a part in fuzzy c-means clustering technique. This process introduces an additional information in order to solve an ill-posed optimization problem. In this study, we focus on regularization by relative entropy approach, where in our optimization problem we aim to minimize the dissimilarity inside clusters. Finding an appropriate membership degree to each data object is our objective, because an appropriate membership degree leads to an accurate clustering result. Our clustering results in synthetic data sets, gaussian based data sets, and real world data sets show that our proposed model achieves a good accuracy.

Keywords: clustering, fuzzy c-means, regularization, relative entropy

Procedia PDF Downloads 240
27950 Small Micro and Medium Enterprises Perception-Based Framework to Access Financial Support

Authors: Melvin Mothoa

Abstract:

Small Micro and Medium Enterprises are very significant for the development of their market economies. They are the main creators of the new working places, and they present a vital core of the market economy in countries across the globe. Access to finance is identified as crucial for small, micro, and medium-sized enterprises for their growth and innovation. This paper is conceived to propose a perception-based SMME framework to aid in access to financial support. Furthermore, the study will address issues that impede SMMEs in South Africa from obtaining finance from financial institutions. The framework will be tested against data collected from 200 Small Micro & Medium Enterprises in the Gauteng province of South Africa. The study adopts a quantitative method, and the delivery of self-administered questionnaires to SMMEs will be the primary data collection tool. Structural equation modeling will be used to further analyse the data collected.

Keywords: finance, small business, growth, development

Procedia PDF Downloads 70
27949 Data Augmentation for Automatic Graphical User Interface Generation Based on Generative Adversarial Network

Authors: Xulu Yao, Moi Hoon Yap, Yanlong Zhang

Abstract:

As a branch of artificial neural network, deep learning is widely used in the field of image recognition, but the lack of its dataset leads to imperfect model learning. By analysing the data scale requirements of deep learning and aiming at the application in GUI generation, it is found that the collection of GUI dataset is a time-consuming and labor-consuming project, which is difficult to meet the needs of current deep learning network. To solve this problem, this paper proposes a semi-supervised deep learning model that relies on the original small-scale datasets to produce a large number of reliable data sets. By combining the cyclic neural network with the generated countermeasure network, the cyclic neural network can learn the sequence relationship and characteristics of data, make the generated countermeasure network generate reasonable data, and then expand the Rico dataset. Relying on the network structure, the characteristics of collected data can be well analysed, and a large number of reasonable data can be generated according to these characteristics. After data processing, a reliable dataset for model training can be formed, which alleviates the problem of dataset shortage in deep learning.

Keywords: GUI, deep learning, GAN, data augmentation

Procedia PDF Downloads 148
27948 Cloud-Based Multiresolution Geodata Cube for Efficient Raster Data Visualization and Analysis

Authors: Lassi Lehto, Jaakko Kahkonen, Juha Oksanen, Tapani Sarjakoski

Abstract:

The use of raster-formatted data sets in geospatial analysis is increasing rapidly. At the same time, geographic data are being introduced into disciplines outside the traditional domain of geoinformatics, like climate change, intelligent transport, and immigration studies. These developments call for better methods to deliver raster geodata in an efficient and easy-to-use manner. Data cube technologies have traditionally been used in the geospatial domain for managing Earth Observation data sets that have strict requirements for effective handling of time series. The same approach and methodologies can also be applied in managing other types of geospatial data sets. A cloud service-based geodata cube, called GeoCubes Finland, has been developed to support online delivery and analysis of most important geospatial data sets with national coverage. The main target group of the service is the academic research institutes in the country. The most significant aspects of the GeoCubes data repository include the use of multiple resolution levels, cloud-optimized file structure, and a customized, flexible content access API. Input data sets are pre-processed while being ingested into the repository to bring them into a harmonized form in aspects like georeferencing, sampling resolutions, spatial subdivision, and value encoding. All the resolution levels are created using an appropriate generalization method, selected depending on the nature of the source data set. Multiple pre-processed resolutions enable new kinds of online analysis approaches to be introduced. Analysis processes based on interactive visual exploration can be effectively carried out, as the level of resolution most close to the visual scale can always be used. In the same way, statistical analysis can be carried out on resolution levels that best reflect the scale of the phenomenon being studied. Access times remain close to constant, independent of the scale applied in the application. The cloud service-based approach, applied in the GeoCubes Finland repository, enables analysis operations to be performed on the server platform, thus making high-performance computing facilities easily accessible. The developed GeoCubes API supports this kind of approach for online analysis. The use of cloud-optimized file structures in data storage enables the fast extraction of subareas. The access API allows for the use of vector-formatted administrative areas and user-defined polygons as definitions of subareas for data retrieval. Administrative areas of the country in four levels are available readily from the GeoCubes platform. In addition to direct delivery of raster data, the service also supports the so-called virtual file format, in which only a small text file is first downloaded. The text file contains links to the raster content on the service platform. The actual raster data is downloaded on demand, from the spatial area and resolution level required in each stage of the application. By the geodata cube approach, pre-harmonized geospatial data sets are made accessible to new categories of inexperienced users in an easy-to-use manner. At the same time, the multiresolution nature of the GeoCubes repository facilitates expert users to introduce new kinds of interactive online analysis operations.

Keywords: cloud service, geodata cube, multiresolution, raster geodata

Procedia PDF Downloads 104
27947 The Effect of Microfinance on Labor Productivity of SME - The Case of Iran

Authors: Sayyed Abdolmajid Jalaee Esfand Abadi, Sepideh Samimi

Abstract:

Since one of the major difficulties to develop small manufacturing enterpriser in developing countries is the limitations of financing activities, this paper want to answer the question: “what is the role and status of micro finance in improving the labor productivity of small industries in Iran?” The results of panel data estimation show that micro finance in Iran has not yet been able to work efficiently and provide the required credit and investment. Also, reducing economy’s dependence on oil revenues reduced and increasing its reliance on domestic production and exports of industrial production can increase the productivity of workforce in Iranian small industries.

Keywords: microfinance, small manufacturing enterprises (SME), workforce productivity, Iran, panel data

Procedia PDF Downloads 391
27946 Impact of Small and Medium Enterprises on Economic Development in the Gulf Cooperation Council: Quantitative Approaches

Authors: Hanadi Al-Mubaraki, Michael Busler

Abstract:

Both in the developed and developing countries as well as Gulf Cooperation Council (GCC), the small and medium-sized enterprises (SMEs) proven to be main drivers of jobs creation and tools to accelerate economic development and economic diversification. This paper seeks to investigate and identify the strengths and weakness of SME as a veritable tool in economic development. A survey method was used to gather data from 171 SME from Gulf Cooperation Council (GCC). The research methodology uses a quantitative approach (survey) while data were collected with a structured questionnaire and analyzed with several descriptive statistics. The results of the study, therefore, will present sets of the strengths of SME in GCC such as 1) government supported local products (59%), 2) promoting SME local products rather than international products (47%), 3) reduce the legal and administrative procedures of SME establishment (46%) and weakness of SME in GCC such as: 1) lack of funding during the initial phase of the project (46%), 2) lack of liquidity during project continuity (39%), and 3) strong competition in the domestic and global market (38%). The study findings will be guidelines for academia and practitioners such as governments, policymakers, funded organizations, universities and strategic institutions for successful implementation.

Keywords: SME, economic development, GCC, strengths and weaknesses

Procedia PDF Downloads 121
27945 A Note on the Fractal Dimension of Mandelbrot Set and Julia Sets in Misiurewicz Points

Authors: O. Boussoufi, K. Lamrini Uahabi, M. Atounti

Abstract:

The main purpose of this paper is to calculate the fractal dimension of some Julia Sets and Mandelbrot Set in the Misiurewicz Points. Using Matlab to generate the Julia Sets images that match the Misiurewicz points and using a Fractal software, we were able to find different measures that characterize those fractals in textures and other features. We are actually focusing on fractal dimension and the error calculated by the software. When executing the given equation of regression or the log-log slope of image a Box Counting method is applied to the entire image, and chosen settings are available in a FracLAc Program. Finally, a comparison is done for each image corresponding to the area (boundary) where Misiurewicz Point is located.

Keywords: box counting, FracLac, fractal dimension, Julia Sets, Mandelbrot Set, Misiurewicz Points

Procedia PDF Downloads 183
27944 Spectral Anomaly Detection and Clustering in Radiological Search

Authors: Thomas L. McCullough, John D. Hague, Marylesa M. Howard, Matthew K. Kiser, Michael A. Mazur, Lance K. McLean, Johanna L. Turk

Abstract:

Radiological search and mapping depends on the successful recognition of anomalies in large data sets which contain varied and dynamic backgrounds. We present a new algorithmic approach for real-time anomaly detection which is resistant to common detector imperfections, avoids the limitations of a source template library and provides immediate, and easily interpretable, user feedback. This algorithm is based on a continuous wavelet transform for variance reduction and evaluates the deviation between a foreground measurement and a local background expectation using methods from linear algebra. We also present a technique for recognizing and visualizing spectrally similar clusters of data. This technique uses Laplacian Eigenmap Manifold Learning to perform dimensional reduction which preserves the geometric "closeness" of the data while maintaining sensitivity to outlying data. We illustrate the utility of both techniques on real-world data sets.

Keywords: radiological search, radiological mapping, radioactivity, radiation protection

Procedia PDF Downloads 669
27943 Experimental and CFD of Desgined Small Wind Turbine

Authors: Tarek A. Mekail, Walid M. A. Elmagid

Abstract:

Many researches have concentrated on improving the aerodynamic performance of wind turbine blade through testing and theoretical studies. A small wind turbine blade is designed, fabricated and tested. The power performance of small horizontal axis wind turbines is simulated in details using Computational Fluid Dynamic (CFD). The three-dimensional CFD models are presented using ANSYS-CFX v13 software for predicting the performance of a small horizontal axis wind turbine. The simulation results are compared with the experimental data measured from a small wind turbine model, which designed according to a vehicle-based test system. The analysis of wake effect and aerodynamic of the blade can be carried out when the rotational effect was simulated. Finally, comparison between experimental, numerical and analytical performance has been done. The comparison is fairly good.

Keywords: small wind turbine, CFD of wind turbine, CFD, performance of wind turbine, test of small wind turbine, wind turbine aerodynamic, 3D model

Procedia PDF Downloads 510
27942 Improving the Deficiencies in Entrepreneurship Training for Small Businesses in Emerging Markets

Authors: Eno Jah Tabogo

Abstract:

The aim of this research is to identify and examine current deficiencies in entrepreneurial training in improving the performance of small businesses in sub Saharan Africa economies. This research achieves this by examining the course content, training methods, and profiles of trainers and trainees of small business service providers in Sub Saharan Africa (SSA) to identify training deficiencies in improving small businesses. Data was for the analysis was collected from a sample of four entrepreneurial training providers in SSA. These four providers served an average of 1,500 trainees. Questionnaire was used to collect data via face to face and through telephone. Face validity was determined by distributing the questionnaire among a group of colleagues, followed by a group discussion to strengthen the validity of the questionnaire. Interviews were also held with managers of training programs. Content and descriptive statistics was used to analyse the data collected. The results indicated only 25% of the training content were entrepreneurial. In terms of service provided, both business, entrepreneurial, technical and after-care services were identified. It was also discovered that owners of training firms had no formal entrepreneurship background. The paper contributes by advocating for a comprehensive entrepreneurship-training program for successful small business enterprises. Recommendations that could help sustain emerging small business enterprises and direction for further research are presented.

Keywords: entrepreneurship, emerging markets, small business, training

Procedia PDF Downloads 110
27941 Constructing a Physics Guided Machine Learning Neural Network to Predict Tonal Noise Emitted by a Propeller

Authors: Arthur D. Wiedemann, Christopher Fuller, Kyle A. Pascioni

Abstract:

With the introduction of electric motors, small unmanned aerial vehicle designers have to consider trade-offs between acoustic noise and thrust generated. Currently, there are few low-computational tools available for predicting acoustic noise emitted by a propeller into the far-field. Artificial neural networks offer a highly non-linear and adaptive model for predicting isolated and interactive tonal noise. But neural networks require large data sets, exceeding practical considerations in modeling experimental results. A methodology known as physics guided machine learning has been applied in this study to reduce the required data set to train the network. After building and evaluating several neural networks, the best model is investigated to determine how the network successfully predicts the acoustic waveform. Lastly, a post-network transfer function is developed to remove discontinuity from the predicted waveform. Overall, methodologies from physics guided machine learning show a notable improvement in prediction performance, but additional loss functions are necessary for constructing predictive networks on small datasets.

Keywords: aeroacoustics, machine learning, propeller, rotor, neural network, physics guided machine learning

Procedia PDF Downloads 179
27940 Training a Neural Network Using Input Dropout with Aggressive Reweighting (IDAR) on Datasets with Many Useless Features

Authors: Stylianos Kampakis

Abstract:

This paper presents a new algorithm for neural networks called “Input Dropout with Aggressive Re-weighting” (IDAR) aimed specifically at datasets with many useless features. IDAR combines two techniques (dropout of input neurons and aggressive re weighting) in order to eliminate the influence of noisy features. The technique can be seen as a generalization of dropout. The algorithm is tested on two different benchmark data sets: a noisy version of the iris dataset and the MADELON data set. Its performance is compared against three other popular techniques for dealing with useless features: L2 regularization, LASSO and random forests. The results demonstrate that IDAR can be an effective technique for handling data sets with many useless features.

Keywords: neural networks, feature selection, regularization, aggressive reweighting

Procedia PDF Downloads 425
27939 Predicting the Diagnosis of Alzheimer’s Disease: Development and Validation of Machine Learning Models

Authors: Jay L. Fu

Abstract:

Patients with Alzheimer's disease progressively lose their memory and thinking skills and, eventually, the ability to carry out simple daily tasks. The disease is irreversible, but early detection and treatment can slow down the disease progression. In this research, publicly available MRI data and demographic data from 373 MRI imaging sessions were utilized to build models to predict dementia. Various machine learning models, including logistic regression, k-nearest neighbor, support vector machine, random forest, and neural network, were developed. Data were divided into training and testing sets, where training sets were used to build the predictive model, and testing sets were used to assess the accuracy of prediction. Key risk factors were identified, and various models were compared to come forward with the best prediction model. Among these models, the random forest model appeared to be the best model with an accuracy of 90.34%. MMSE, nWBV, and gender were the three most important contributing factors to the detection of Alzheimer’s. Among all the models used, the percent in which at least 4 of the 5 models shared the same diagnosis for a testing input was 90.42%. These machine learning models allow early detection of Alzheimer’s with good accuracy, which ultimately leads to early treatment of these patients.

Keywords: Alzheimer's disease, clinical diagnosis, magnetic resonance imaging, machine learning prediction

Procedia PDF Downloads 113
27938 Modified InVEST for Whatsapp Messages Forensic Triage and Search through Visualization

Authors: Agria Rhamdhan

Abstract:

WhatsApp as the most popular mobile messaging app has been used as evidence in many criminal cases. As the use of mobile messages generates large amounts of data, forensic investigation faces the challenge of large data problems. The hardest part of finding this important evidence is because current practice utilizes tools and technique that require manual analysis to check all messages. That way, analyze large sets of mobile messaging data will take a lot of time and effort. Our work offers methodologies based on forensic triage to reduce large data to manageable sets resulting easier to do detailed reviews, then show the results through interactive visualization to show important term, entities and relationship through intelligent ranking using Term Frequency-Inverse Document Frequency (TF-IDF) and Latent Dirichlet Allocation (LDA) Model. By implementing this methodology, investigators can improve investigation processing time and result's accuracy.

Keywords: forensics, triage, visualization, WhatsApp

Procedia PDF Downloads 130
27937 Spatio-Temporal Data Mining with Association Rules for Lake Van

Authors: Tolga Aydin, M. Fatih Alaeddinoğlu

Abstract:

People, throughout the history, have made estimates and inferences about the future by using their past experiences. Developing information technologies and the improvements in the database management systems make it possible to extract useful information from knowledge in hand for the strategic decisions. Therefore, different methods have been developed. Data mining by association rules learning is one of such methods. Apriori algorithm, one of the well-known association rules learning algorithms, is not commonly used in spatio-temporal data sets. However, it is possible to embed time and space features into the data sets and make Apriori algorithm a suitable data mining technique for learning spatio-temporal association rules. Lake Van, the largest lake of Turkey, is a closed basin. This feature causes the volume of the lake to increase or decrease as a result of change in water amount it holds. In this study, evaporation, humidity, lake altitude, amount of rainfall and temperature parameters recorded in Lake Van region throughout the years are used by the Apriori algorithm and a spatio-temporal data mining application is developed to identify overflows and newly-formed soil regions (underflows) occurring in the coastal parts of Lake Van. Identifying possible reasons of overflows and underflows may be used to alert the experts to take precautions and make the necessary investments.

Keywords: apriori algorithm, association rules, data mining, spatio-temporal data

Procedia PDF Downloads 343
27936 Small Entrepreneurship Supporting Economic Policy in Georgia

Authors: G. Erkomaishvili

Abstract:

This paper discusses small entrepreneurship development strategy in Georgia and the tools and regulations that will encourage development of small entrepreneurship. The current situation in the small entrepreneurship sector, as well as factors affecting growth and decline in the sector and the priorities of state support, are studied and analyzed. The objective of this research is to assess the current situation of the sector to highlight opportunities and reveal the gaps. State support of small entrepreneurship should become a key priority in the country’s economic policy, as development of the sector will ensure social, economic and political stability. Based on the research, a small entrepreneurship development strategy is presented; corresponding conclusions are made and recommendations are developed.

Keywords: economic policy for small entrepreneurship development, small entrepreneurship, regulations, small entrepreneurship development strategy

Procedia PDF Downloads 446
27935 Big Data: Concepts, Technologies and Applications in the Public Sector

Authors: A. Alexandru, C. A. Alexandru, D. Coardos, E. Tudora

Abstract:

Big Data (BD) is associated with a new generation of technologies and architectures which can harness the value of extremely large volumes of very varied data through real time processing and analysis. It involves changes in (1) data types, (2) accumulation speed, and (3) data volume. This paper presents the main concepts related to the BD paradigm, and introduces architectures and technologies for BD and BD sets. The integration of BD with the Hadoop Framework is also underlined. BD has attracted a lot of attention in the public sector due to the newly emerging technologies that allow the availability of network access. The volume of different types of data has exponentially increased. Some applications of BD in the public sector in Romania are briefly presented.

Keywords: big data, big data analytics, Hadoop, cloud

Procedia PDF Downloads 278
27934 Building 1-Well-Covered Graphs by Corona, Join, and Rooted Product of Graphs

Authors: Vadim E. Levit, Eugen Mandrescu

Abstract:

A graph is well-covered if all its maximal independent sets are of the same size. A well-covered graph is 1-well-covered if deletion of every vertex of the graph leaves it well-covered. It is known that a graph without isolated vertices is 1-well-covered if and only if every two disjoint independent sets are included in two disjoint maximum independent sets. Well-covered graphs are related to combinatorial commutative algebra (e.g., every Cohen-Macaulay graph is well-covered, while each Gorenstein graph without isolated vertices is 1-well-covered). Our intent is to construct several infinite families of 1-well-covered graphs using the following known graph operations: corona, join, and rooted product of graphs. Adopting some known techniques used to advantage for well-covered graphs, one can prove that: if the graph G has no isolated vertices, then the corona of G and H is 1-well-covered if and only if H is a complete graph of order two at least; the join of the graphs G and H is 1-well-covered if and only if G and H have the same independence number and both are 1-well-covered; if H satisfies the property that every three pairwise disjoint independent sets are included in three pairwise disjoint maximum independent sets, then the rooted product of G and H is 1-well-covered, for every graph G. These findings show not only how to generate some more families of 1-well-covered graphs, but also that, to this aim, sometimes, one may use graphs that are not necessarily 1-well-covered.

Keywords: maximum independent set, corona, concatenation, join, well-covered graph

Procedia PDF Downloads 174
27933 A Less Complexity Deep Learning Method for Drones Detection

Authors: Mohamad Kassab, Amal El Fallah Seghrouchni, Frederic Barbaresco, Raed Abu Zitar

Abstract:

Detecting objects such as drones is a challenging task as their relative size and maneuvering capabilities deceive machine learning models and cause them to misclassify drones as birds or other objects. In this work, we investigate applying several deep learning techniques to benchmark real data sets of flying drones. A deep learning paradigm is proposed for the purpose of mitigating the complexity of those systems. The proposed paradigm consists of a hybrid between the AdderNet deep learning paradigm and the Single Shot Detector (SSD) paradigm. The goal was to minimize multiplication operations numbers in the filtering layers within the proposed system and, hence, reduce complexity. Some standard machine learning technique, such as SVM, is also tested and compared to other deep learning systems. The data sets used for training and testing were either complete or filtered in order to remove the images with mall objects. The types of data were RGB or IR data. Comparisons were made between all these types, and conclusions were presented.

Keywords: drones detection, deep learning, birds versus drones, precision of detection, AdderNet

Procedia PDF Downloads 145
27932 LEGO Bricks and Creativity: A Comparison between Classic and Single Sets

Authors: Maheen Zia

Abstract:

Near the early twenty-first century, LEGO decided to diversify its product range which resulted in more specific and single-outcome sets occupying the store shelves than classic kits having fairly all-purpose bricks. Earlier, LEGOs came with more bricks and lesser instructions. Today, there are more single kits being produced and sold, which come with a strictly defined set of guidelines. If one set is used to make a car, the same bricks cannot be put together to produce any other article. Earlier, multiple bricks gave children a chance to be imaginative, think of new items and construct them (by just putting the same pieces differently). The new products are less open-ended and offer a limited possibility for players in both designing and realizing those designs. The article reviews (in the light of existing research) how classic LEGO sets could help enhance a child’s creativity in comparison with single sets, which allow a player to interact (not experiment) with the bricks.

Keywords: constructive play, creativity, LEGO, play-based learning

Procedia PDF Downloads 165
27931 Feature Selection of Personal Authentication Based on EEG Signal for K-Means Cluster Analysis Using Silhouettes Score

Authors: Jianfeng Hu

Abstract:

Personal authentication based on electroencephalography (EEG) signals is one of the important field for the biometric technology. More and more researchers have used EEG signals as data source for biometric. However, there are some disadvantages for biometrics based on EEG signals. The proposed method employs entropy measures for feature extraction from EEG signals. Four type of entropies measures, sample entropy (SE), fuzzy entropy (FE), approximate entropy (AE) and spectral entropy (PE), were deployed as feature set. In a silhouettes calculation, the distance from each data point in a cluster to all another point within the same cluster and to all other data points in the closest cluster are determined. Thus silhouettes provide a measure of how well a data point was classified when it was assigned to a cluster and the separation between them. This feature renders silhouettes potentially well suited for assessing cluster quality in personal authentication methods. In this study, “silhouettes scores” was used for assessing the cluster quality of k-means clustering algorithm is well suited for comparing the performance of each EEG dataset. The main goals of this study are: (1) to represent each target as a tuple of multiple feature sets, (2) to assign a suitable measure to each feature set, (3) to combine different feature sets, (4) to determine the optimal feature weighting. Using precision/recall evaluations, the effectiveness of feature weighting in clustering was analyzed. EEG data from 22 subjects were collected. Results showed that: (1) It is possible to use fewer electrodes (3-4) for personal authentication. (2) There was the difference between each electrode for personal authentication (p<0.01). (3) There is no significant difference for authentication performance among feature sets (except feature PE). Conclusion: The combination of k-means clustering algorithm and silhouette approach proved to be an accurate method for personal authentication based on EEG signals.

Keywords: personal authentication, K-mean clustering, electroencephalogram, EEG, silhouettes

Procedia PDF Downloads 255
27930 Audit of TPS photon beam dataset for small field output factors using OSLDs against RPC standard dataset

Authors: Asad Yousuf

Abstract:

Purpose: The aim of the present study was to audit treatment planning system beam dataset for small field output factors against standard dataset produced by radiological physics center (RPC) from a multicenter study. Such data are crucial for validity of special techniques, i.e., IMRT or stereotactic radiosurgery. Materials/Method: In this study, multiple small field size output factor datasets were measured and calculated for 6 to 18 MV x-ray beams using the RPC recommend methods. These beam datasets were measured at 10 cm depth for 10 × 10 cm2 to 2 × 2 cm2 field sizes, defined by collimator jaws at 100 cm. The measurements were made with a Landauer’s nanoDot OSLDs whose volume is small enough to gather a full ionization reading even for the 1×1 cm2 field size. At our institute the beam data including output factors have been commissioned at 5 cm depth with an SAD setup. For comparison with the RPC data, the output factors were converted to an SSD setup using tissue phantom ratios. SSD setup also enables coverage of the ion chamber in 2×2 cm2 field size. The measured output factors were also compared with those calculated by Eclipse™ treatment planning software. Result: The measured and calculated output factors are in agreement with RPC dataset within 1% and 4% respectively. The large discrepancies in TPS reflect the increased challenge in converting measured data into a commissioned beam model for very small fields. Conclusion: OSLDs are simple, durable, and accurate tool to verify doses that delivered using small photon beam fields down to a 1x1 cm2 field sizes. The study emphasizes that the treatment planning system should always be evaluated for small field out factors for the accurate dose delivery in clinical setting.

Keywords: small field dosimetry, optically stimulated luminescence, audit treatment, radiological physics center

Procedia PDF Downloads 299
27929 Stability Assessment of Chamshir Dam Based on DEM, South West Zagros

Authors: Rezvan Khavari

Abstract:

The Zagros fold-thrust belt in SW Iran is a part of the Alpine-Himalayan system which consists of a variety of structures with different sizes or geometries. The study area is Chamshir Dam, which is located on the Zohreh River, 20 km southeast of Gachsaran City (southwest Iran). The satellite images are valuable means available to geologists for locating geological or geomorphological features expressing regional fault or fracture systems, therefore, the satellite images were used for structural analysis of the Chamshir dam area. As well, using the DEM and geological maps, 3D Models of the area have been constructed. Then, based on these models, all the acquired fracture traces data were integrated in Geographic Information System (GIS) environment by using Arc GIS software. Based on field investigation and DEM model, main structures in the area consist of Cham Shir syncline and two fault sets, the main thrust faults with NW-SE direction and small normal faults in NE-SW direction. There are three joint sets in the study area, both of them (J1 and J3) are the main large fractures around the Chamshir dam. These fractures indeed consist with the normal faults in NE-SW direction. The third joint set in NW-SE is normal to the others. In general, according to topography, geomorphology and structural geology evidences, Chamshir dam has a potential for sliding in some parts of Gachsaran formation.

Keywords: DEM, chamshir dam, zohreh river, satellite images

Procedia PDF Downloads 460