Search results for: predictive mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2019

Search results for: predictive mining

1329 The Study of Power as a Pertinent Motive among Tribal College Students of Assam

Authors: K. P. Gogoi

Abstract:

The current research study investigates the motivational pattern viz Power motivation among the tribal college students of Assam. The sample consisted of 240 college students (120 tribal and 120 non-tribal) ranging from 18-24 years, 60 males and 60 females for both tribal’s and non-tribal’s. Attempts were made to include all the prominent tribes of Assam viz. Thematic Apperception Test, Power motive Scale and a semi structured interview schedule were used to gather information about their family types, parental deprivation, parental relations, social and political belongingness. Mean, Standard Deviation, and t-test were the statistical measures adopted in this 2x2 factorial design study. In addition to this discriminant analysis has been worked out to strengthen the predictive validity of the obtained data. TAT scores reveal significant difference between the tribal’s and non-tribal on power motivation. However results obtained on gender difference indicates similar scores among both the cultures. Cross validation of the TAT results was done by using the power motive scale by T. S. Dapola which confirms the results on need for power through TAT scores. Power motivation has been studied in three directions i.e. coercion, inducement and restraint. An interesting finding is that on coercion tribal’s score high showing significant difference whereas in inducement or seduction the non-tribal’s scored high showing significant difference. On the other hand on restraint no difference exists between both cultures. Discriminant analysis has been worked out between the variables n-power, coercion, inducement and restraint. Results indicated that inducement or seduction (.502) is the dependent measure which has the most discriminating power between these two cultures.

Keywords: power motivation, tribal, social, political, predictive validity, cross validation, coercion, inducement, restraint

Procedia PDF Downloads 478
1328 Role of Imaging in Predicting the Receptor Positivity Status in Lung Adenocarcinoma: A Chapter in Radiogenomics

Authors: Sonal Sethi, Mukesh Yadav, Abhimanyu Gupta

Abstract:

The upcoming field of radiogenomics has the potential to upgrade the role of imaging in lung cancer management by noninvasive characterization of tumor histology and genetic microenvironment. Receptor positivity like epidermal growth factor receptor (EGFR) and anaplastic lymphoma kinase (ALK) genotyping are critical in lung adenocarcinoma for treatment. As conventional identification of receptor positivity is an invasive procedure, we analyzed the features on non-invasive computed tomography (CT), which predicts the receptor positivity in lung adenocarcinoma. Retrospectively, we did a comprehensive study from 77 proven lung adenocarcinoma patients with CT images, EGFR and ALK receptor genotyping, and clinical information. Total 22/77 patients were receptor-positive (15 had only EGFR mutation, 6 had ALK mutation, and 1 had both EGFR and ALK mutation). Various morphological characteristics and metastatic distribution on CT were analyzed along with the clinical information. Univariate and multivariable logistic regression analyses were used. On multivariable logistic regression analysis, we found spiculated margin, lymphangitic spread, air bronchogram, pleural effusion, and distant metastasis had a significant predictive value for receptor mutation status. On univariate analysis, air bronchogram and pleural effusion had significant individual predictive value. Conclusions: Receptor positive lung cancer has characteristic imaging features compared with nonreceptor positive lung adenocarcinoma. Since CT is routinely used in lung cancer diagnosis, we can predict the receptor positivity by a noninvasive technique and would follow a more aggressive algorithm for evaluation of distant metastases as well as for the treatment.

Keywords: lung cancer, multidisciplinary cancer care, oncologic imaging, radiobiology

Procedia PDF Downloads 131
1327 A Systematic Review Investigating the Use of EEG Measures in Neuromarketing

Authors: A. M. Byrne, E. Bonfiglio, C. Rigby, N. Edelstyn

Abstract:

Introduction: Neuromarketing employs numerous methodologies when investigating products and advertisement effectiveness. Electroencephalography (EEG), a non-invasive measure of electrical activity from the brain, is commonly used in neuromarketing. EEG data can be considered using time-frequency (TF) analysis, where changes in the frequency of brainwaves are calculated to infer participant’s mental states, or event-related potential (ERP) analysis, where changes in amplitude are observed in direct response to a stimulus. This presentation discusses the findings of a systematic review of EEG measures in neuromarketing. A systematic review summarises evidence on a research question, using explicit measures to identify, select, and critically appraise relevant research papers. Thissystematic review identifies which EEG measures are the most robust predictor of customer preference and purchase intention. Methods: Search terms identified174 papers that used EEG in combination with marketing-related stimuli. Publications were excluded if they were written in a language other than English or were not published as journal articles (e.g., book chapters). The review investigated which TF effect (e.g., theta-band power) and ERP component (e.g., N400) most consistently reflected preference and purchase intention. Machine-learning prediction was also investigated, along with the use of EEG combined with physiological measures such as eye-tracking. Results: Frontal alpha asymmetry was the most reliable TF signal, where an increase in activity over the left side of the frontal lobe indexed a positive response to marketing stimuli, while an increase in activity over the right side indexed a negative response. The late positive potential, a positive amplitude increase around 600 ms after stimulus presentation, was the most reliable ERP component, reflecting the conscious emotional evaluation of marketing stimuli. However, each measure showed mixed results when related to preference and purchase behaviour. Predictive accuracy was greatly improved through machine-learning algorithms such as deep neural networks, especially when combined with eye-tracking or facial expression analyses. Discussion: This systematic review provides a novel catalogue of the most effective use of each EEG measure commonly used in neuromarketing. Exciting findings to emerge are the identification of the frontal alpha asymmetry and late positive potential as markers of preferential responses to marketing stimuli. Predictive accuracy using machine-learning algorithms achieved predictive accuracies as high as 97%, and future research should therefore focus on machine-learning prediction when using EEG measures in neuromarketing.

Keywords: EEG, ERP, neuromarketing, machine-learning, systematic review, time-frequency

Procedia PDF Downloads 108
1326 Ecological Risk Aspects of Essential Trace Metals in Soil Derived From Gold Mining Region, South Africa

Authors: Lowanika Victor Tibane, David Mamba

Abstract:

Human body, animals, and plants depend on certain essential metals in permissible quantities for their survival. Excessive metal concentration may cause severe malfunctioning of the organisms and even fatal in extreme cases. Because of gold mining in the Witwatersrand basin in South Africa, enormous untreated mine dumps comprise elevated concentration of essential trace elements. Elevated quantities of trace metal have direct negative impact on the quality of soil for different land use types, reduce soil efficiency for plant growth, and affect the health human and animals. A total of 21 subsoil samples were examined using inductively coupled plasma optical emission spectrometry and X-ray fluorescence methods and the results elevated men concentration of Fe (36,433.39) > S (5,071.83) > Cu (1,717,28) > Mn (612.81) > Cr (74.52) > Zn (68.67) > Ni (40.44) > Co (9.63) > P (3.49) > Mo > (2.74), reported in mg/kg. Using various contamination indices, it was discovered that the sites surveyed are on average moderately contaminated with Co, Cr, Cu, Mn, Ni, S, and Zn. The ecological risk assessment revealed a low ecological risk for Cr, Ni and Zn, whereas Cu poses a very high ecological risk.

Keywords: essential trace elements, soil contamination, contamination indices, toxicity, descriptive statistics, ecological risk evaluation

Procedia PDF Downloads 88
1325 Factors Affecting Visual Environment in Mine Lighting

Authors: N. Lakshmipathy, Ch. S. N. Murthy, M. Aruna

Abstract:

The design of lighting systems for surface mines is not an easy task because of the unique environment and work procedures encountered in the mines. The primary objective of this paper is to identify the major problems encountered in mine lighting application and to provide guidance in the solution of these problems. In the surface mining reflectance of surrounding surfaces is one of the important factors, which improve the vision, in the night hours. But due to typical working nature in the mines it is very difficult to fulfill these requirements, and also the orientation of the light at work site is a challenging task. Due to this reason machine operator and other workers in a mine need to be able to orient themselves in a difficult visual environment. The haul roads always keep on changing to tune with the mining activity. Other critical area such as dumpyards, stackyards etc. also change their phase with time, and it is difficult to illuminate such areas. Mining is a hazardous occupation, with workers exposed to adverse conditions; apart from the need for hard physical labor, there is exposure to stress and environmental pollutants like dust, noise, heat, vibration, poor illumination, radiation, etc. Visibility is restricted when operating load haul dumper and Heavy Earth Moving Machinery (HEMM) vehicles resulting in a number of serious accidents. one of the leading causes of these accidents is the inability of the equipment operator to see clearly people, objects or hazards around the machine. Results indicate blind spots are caused primarily by posts, the back of the operator's cab, and by lights and light brackets. The careful designed and implemented, lighting systems provide mine workers improved visibility and contribute to improved safety, productivity and morale. Properly designed lighting systems can improve visibility and safety during working in the opencast mines.

Keywords: contrast, efficacy, illuminance, illumination, light, luminaire, luminance, reflectance, visibility

Procedia PDF Downloads 357
1324 Using the Transtheoretical Model to Investigate Stages of Change in Regular Volunteer Service among Seniors in Community

Authors: Pei-Ti Hsu, I-Ju Chen, Jeu-Jung Chen, Cheng-Fen Chang, Shiu-Yan Yang

Abstract:

Taiwan now is an aging society Research on the elderly should not be confined to caring for seniors, but should also be focused on ways to improve health and the quality of life. Senior citizens who participate in volunteer services could become less lonely, have new growth opportunities, and regain a sense of accomplishment. Thus, the question of how to get the elderly to participate in volunteer service is worth exploring. Apply the Transtheoretical Model to understand stages of change in regular volunteer service and voluntary service behaviour among the seniors. 1525 adults over the age of 65 from the Renai district of Keelung City were interviewed. The research tool was a self-constructed questionnaire and individual interviews were conducted to collect data. Then the data was processed and analyzed using the IBM SPSS Statistics 20 (Windows version) statistical software program. In the past six months, research subjects averaged 9.92 days of volunteer services. A majority of these elderly individuals had no intention to change their regular volunteer services. We discovered that during the maintenance stage, the self-efficacy for volunteer services was higher than during all other stages, but self-perceived barriers were less during the preparation stage and action stage. Self-perceived benefits were found to have an important predictive power for those with regular volunteer service behaviors in the previous stage, and self-efficacy was found to have an important predictive power for those with regular volunteer service behaviors in later stages. The research results support the conclusion that community nursing staff should group elders based on their regular volunteer services change stages and design appropriate behavioral change strategies.

Keywords: seniors, stages of change in regular volunteer services, volunteer service behavior, self-efficacy, self-perceived benefits

Procedia PDF Downloads 425
1323 Statistically Accurate Synthetic Data Generation for Enhanced Traffic Predictive Modeling Using Generative Adversarial Networks and Long Short-Term Memory

Authors: Srinivas Peri, Siva Abhishek Sirivella, Tejaswini Kallakuri, Uzair Ahmad

Abstract:

Effective traffic management and infrastructure planning are crucial for the development of smart cities and intelligent transportation systems. This study addresses the challenge of data scarcity by generating realistic synthetic traffic data using the PeMS-Bay dataset, improving the accuracy and reliability of predictive modeling. Advanced synthetic data generation techniques, including TimeGAN, GaussianCopula, and PAR Synthesizer, are employed to produce synthetic data that replicates the statistical and structural characteristics of real-world traffic. Future integration of Spatial-Temporal Generative Adversarial Networks (ST-GAN) is planned to capture both spatial and temporal correlations, further improving data quality and realism. The performance of each synthetic data generation model is evaluated against real-world data to identify the best models for accurately replicating traffic patterns. Long Short-Term Memory (LSTM) networks are utilized to model and predict complex temporal dependencies within traffic patterns. This comprehensive approach aims to pinpoint areas with low vehicle counts, uncover underlying traffic issues, and inform targeted infrastructure interventions. By combining GAN-based synthetic data generation with LSTM-based traffic modeling, this study supports data-driven decision-making that enhances urban mobility, safety, and the overall efficiency of city planning initiatives.

Keywords: GAN, long short-term memory, synthetic data generation, traffic management

Procedia PDF Downloads 17
1322 The Role of Artificial Intelligence in Criminal Procedure

Authors: Herke Csongor

Abstract:

The artificial intelligence (AI) has been used in the United States of America in the decisionmaking process of the criminal justice system for decades. In the field of law, including criminal law, AI can provide serious assistance in decision-making in many places. The paper reviews four main areas where AI still plays a role in the criminal justice system and where it is expected to play an increasingly important role. The first area is the predictive policing: a number of algorithms are used to prevent the commission of crimes (by predicting potential crime locations or perpetrators). This may include the so-called linking hot-spot analysis, crime linking and the predictive coding. The second area is the Big Data analysis: huge amounts of data sets are already opaque to human activity and therefore unprocessable. Law is one of the largest producers of digital documents (because not only decisions, but nowadays the entire document material is available digitally), and this volume can only and exclusively be handled with the help of computer programs, which the development of AI systems can have an increasing impact on. The third area is the criminal statistical data analysis. The collection of statistical data using traditional methods required enormous human resources. The AI is a huge step forward in that it can analyze the database itself, based on the requested aspects, a collection according to any aspect can be available in a few seconds, and the AI itself can analyze the database and indicate if it finds an important connection either from the point of view of crime prevention or crime detection. Finally, the use of AI during decision-making in both investigative and judicial fields is analyzed in detail. While some are skeptical about the future role of AI in decision-making, many believe that the question is not whether AI will participate in decision-making, but only when and to what extent it will transform the current decision-making system.

Keywords: artificial intelligence, international criminal cooperation, planning and organizing of the investigation, risk assessment

Procedia PDF Downloads 34
1321 Clinical and Radiological Features of Adenomyosis and Its Histopathological Correlation

Authors: Surabhi Agrawal Kohli, Sunita Gupta, Esha Khanuja, Parul Garg, P. Gupta

Abstract:

Background: Adenomyosis is a common gynaecological condition that affects the menstruating women. Uterine enlargement, dysmenorrhoea, and menorrhagia are regarded as the cardinal clinical symptoms of adenomyosis. Classically it was thought, compared with ultrasonography, when adenomyosis is suspected, MRI enables more accurate diagnosis of the disease. Materials and Methods: 172 subjects were enrolled after an informed consent that had complaints of HMB, dyspareunia, dysmenorrhea, and chronic pelvic pain. Detailed history of the enrolled subjects was taken, followed by a clinical examination. These patients were then subjected to TVS where myometrial echo texture, presence of myometrial cysts, blurring of endomyometrial junction was noted. MRI was followed which noted the presence of junctional zone thickness and myometrial cysts. After hysterectomy, histopathological diagnosis was obtained. Results: 78 participants were analysed. The mean age was 44.2 years. 43.5% had parity of 4 or more. heavy menstrual bleeding (HMB) was present in 97.8% and dysmenorrhea in 93.48 % of HPE positive patient. Transvaginal sonography (TVS) and MRI had a sensitivity of 89.13% and 80.43%, specificity of 90.62% and 84.37%, positive likelihood ratio of 9.51 and 5.15, negative likelihood ratio of 0.12 and 0.23, positive predictive value of 93.18% and 88.1%, negative predictive value of 85.29% and 75% and a diagnostic accuracy of 89.74% and 82.5%. Comparison of sensitivity (p=0.289) and specificity (p=0.625) showed no statistically significant difference between TVS and MRI. Conclusion: Prevalence of 30.23%. HMB with dysmenorrhoea and chronic pelvic pain helps in diagnosis. TVS (Endomyometrial junction blurring) is both sensitive and specific in diagnosing adenomyosis without need for additional diagnostic tool. Both TVS and MRI are equally efficient, however because of certain additional advantages of TVS over MRI, it may be used as the first choice of imaging. MRI may be used additionally in difficult cases as well as in patients with existing co-pathologies.

Keywords: adenomyosis, heavy menstrual bleeding, MRI, TVS

Procedia PDF Downloads 489
1320 An Improved K-Means Algorithm for Gene Expression Data Clustering

Authors: Billel Kenidra, Mohamed Benmohammed

Abstract:

Data mining technique used in the field of clustering is a subject of active research and assists in biological pattern recognition and extraction of new knowledge from raw data. Clustering means the act of partitioning an unlabeled dataset into groups of similar objects. Each group, called a cluster, consists of objects that are similar between themselves and dissimilar to objects of other groups. Several clustering methods are based on partitional clustering. This category attempts to directly decompose the dataset into a set of disjoint clusters leading to an integer number of clusters that optimizes a given criterion function. The criterion function may emphasize a local or a global structure of the data, and its optimization is an iterative relocation procedure. The K-Means algorithm is one of the most widely used partitional clustering techniques. Since K-Means is extremely sensitive to the initial choice of centers and a poor choice of centers may lead to a local optimum that is quite inferior to the global optimum, we propose a strategy to initiate K-Means centers. The improved K-Means algorithm is compared with the original K-Means, and the results prove how the efficiency has been significantly improved.

Keywords: microarray data mining, biological pattern recognition, partitional clustering, k-means algorithm, centroid initialization

Procedia PDF Downloads 187
1319 Sampling and Chemical Characterization of Particulate Matter in a Platinum Mine

Authors: Juergen Orasche, Vesta Kohlmeier, George C. Dragan, Gert Jakobi, Patricia Forbes, Ralf Zimmermann

Abstract:

Underground mining poses a difficult environment for both man and machines. At more than 1000 meters underneath the surface of the earth, ores and other mineral resources are still gained by conventional and motorised mining. Adding to the hazards caused by blasting and stone-chipping, the working conditions are best described by the high temperatures of 35-40°C and high humidity, at low air exchange rates. Separate ventilation shafts lead fresh air into a mine and others lead expended air back to the surface. This is essential for humans and machines working deep underground. Nevertheless, mines are widely ramified. Thus the air flow rate at the far end of a tunnel is sensed to be close to zero. In recent years, conventional mining was supplemented by mining with heavy diesel machines. These very flat machines called Load Haul Dump (LHD) vehicles accelerate and ease work in areas favourable for heavy machines. On the other hand, they emit non-filtered diesel exhaust, which constitutes an occupational hazard for the miners. Combined with a low air exchange, high humidity and inorganic dust from the mining it leads to 'black smog' underneath the earth. This work focuses on the air quality in mines employing LHDs. Therefore we performed personal sampling (samplers worn by miners during their work), stationary sampling and aethalometer (Microaeth MA200, Aethlabs) measurements in a platinum mine in around 1000 meters under the earth’s surface. We compared areas of high diesel exhaust emission with areas of conventional mining where no diesel machines were operated. For a better assessment of health risks caused by air pollution we applied a separated gas-/particle-sampling tool (or system), with first denuder section collecting intermediate VOCs. These multi-channel silicone rubber denuders are able to trap IVOCs while allowing particles ranged from 10 nm to 1 µm in diameter to be transmitted with an efficiency of nearly 100%. The second section is represented by a quartz fibre filter collecting particles and adsorbed semi-volatile organic compounds (SVOC). The third part is a graphitized carbon black adsorber – collecting the SVOCs that evaporate from the filter. The compounds collected on these three sections were analyzed in our labs with different thermal desorption techniques coupled with gas chromatography and mass spectrometry (GC-MS). VOCs and IVOCs were measured with a Shimadzu Thermal Desorption Unit (TD20, Shimadzu, Japan) coupled to a GCMS-System QP 2010 Ultra with a quadrupole mass spectrometer (Shimadzu). The GC was equipped with a 30m, BP-20 wax column (0.25mm ID, 0.25µm film) from SGE (Australia). Filters were analyzed with In-situ derivatization thermal desorption gas chromatography time-of-flight-mass spectrometry (IDTD-GC-TOF-MS). The IDTD unit is a modified GL sciences Optic 3 system (GL Sciences, Netherlands). The results showed black carbon concentrations measured with the portable aethalometers up to several mg per m³. The organic chemistry was dominated by very high concentrations of alkanes. Typical diesel engine exhaust markers like alkylated polycyclic aromatic hydrocarbons were detected as well as typical lubrication oil markers like hopanes.

Keywords: diesel emission, personal sampling, aethalometer, mining

Procedia PDF Downloads 152
1318 A General Framework for Measuring the Internal Fraud Risk of an Enterprise Resource Planning System

Authors: Imran Dayan, Ashiqul Khan

Abstract:

Internal corporate fraud, which is fraud carried out by internal stakeholders of a company, affects the well-being of the organisation just like its external counterpart. Even if such an act is carried out for the short-term benefit of a corporation, the act is ultimately harmful to the entity in the long run. Internal fraud is often carried out by relying upon aberrations from usual business processes. Business processes are the lifeblood of a company in modern managerial context. Such processes are developed and fine-tuned over time as a corporation grows through its life stages. Modern corporations have embraced technological innovations into their business processes, and Enterprise Resource Planning (ERP) systems being at the heart of such business processes is a testimony to that. Since ERP systems record a huge amount of data in their event logs, the logs are a treasure trove for anyone trying to detect any sort of fraudulent activities hidden within the day-to-day business operations and processes. This research utilises the ERP systems in place within corporations to assess the likelihood of prospective internal fraud through developing a framework for measuring the risks of fraud through Process Mining techniques and hence finds risky designs and loose ends within these business processes. This framework helps not only in identifying existing cases of fraud in the records of the event log, but also signals the overall riskiness of certain business processes, and hence draws attention for carrying out a redesign of such processes to reduce the chance of future internal fraud while improving internal control within the organisation. The research adds value by applying the concepts of Process Mining into the analysis of data from modern day applications of business process records, which is the ERP event logs, and develops a framework that should be useful to internal stakeholders for strengthening internal control as well as provide external auditors with a tool of use in case of suspicion. The research proves its usefulness through a few case studies conducted with respect to big corporations with complex business processes and an ERP in place.

Keywords: enterprise resource planning, fraud risk framework, internal corporate fraud, process mining

Procedia PDF Downloads 327
1317 Clustering Categorical Data Using the K-Means Algorithm and the Attribute’s Relative Frequency

Authors: Semeh Ben Salem, Sami Naouali, Moetez Sallami

Abstract:

Clustering is a well known data mining technique used in pattern recognition and information retrieval. The initial dataset to be clustered can either contain categorical or numeric data. Each type of data has its own specific clustering algorithm. In this context, two algorithms are proposed: the k-means for clustering numeric datasets and the k-modes for categorical datasets. The main encountered problem in data mining applications is clustering categorical dataset so relevant in the datasets. One main issue to achieve the clustering process on categorical values is to transform the categorical attributes into numeric measures and directly apply the k-means algorithm instead the k-modes. In this paper, it is proposed to experiment an approach based on the previous issue by transforming the categorical values into numeric ones using the relative frequency of each modality in the attributes. The proposed approach is compared with a previously method based on transforming the categorical datasets into binary values. The scalability and accuracy of the two methods are experimented. The obtained results show that our proposed method outperforms the binary method in all cases.

Keywords: clustering, unsupervised learning, pattern recognition, categorical datasets, knowledge discovery, k-means

Procedia PDF Downloads 255
1316 Framework for Integrating Big Data and Thick Data: Understanding Customers Better

Authors: Nikita Valluri, Vatcharaporn Esichaikul

Abstract:

With the popularity of data-driven decision making on the rise, this study focuses on providing an alternative outlook towards the process of decision-making. Combining quantitative and qualitative methods rooted in the social sciences, an integrated framework is presented with a focus on delivering a much more robust and efficient approach towards the concept of data-driven decision-making with respect to not only Big data but also 'Thick data', a new form of qualitative data. In support of this, an example from the retail sector has been illustrated where the framework is put into action to yield insights and leverage business intelligence. An interpretive approach to analyze findings from both kinds of quantitative and qualitative data has been used to glean insights. Using traditional Point-of-sale data as well as an understanding of customer psychographics and preferences, techniques of data mining along with qualitative methods (such as grounded theory, ethnomethodology, etc.) are applied. This study’s final goal is to establish the framework as a basis for providing a holistic solution encompassing both the Big and Thick aspects of any business need. The proposed framework is a modified enhancement in lieu of traditional data-driven decision-making approach, which is mainly dependent on quantitative data for decision-making.

Keywords: big data, customer behavior, customer experience, data mining, qualitative methods, quantitative methods, thick data

Procedia PDF Downloads 157
1315 Exploring Gaming-Learning Interaction in MMOG Using Data Mining Methods

Authors: Meng-Tzu Cheng, Louisa Rosenheck, Chen-Yen Lin, Eric Klopfer

Abstract:

The purpose of the research is to explore some of the ways in which gameplay data can be analyzed to yield results that feedback into the learning ecosystem. Back-end data for all users as they played an MMOG, The Radix Endeavor, was collected, and this study reports the analyses on a specific genetics quest by using the data mining techniques, including the decision tree method. In the study, different reasons for quest failure between participants who eventually succeeded and who never succeeded were revealed. Regarding the in-game tools use, trait examiner was a key tool in the quest completion process. Subsequently, the results of decision tree showed that a lack of trait examiner usage can be made up with additional Punnett square uses, displaying multiple pathways to success in this quest. The methods of analysis used in this study and the resulting usage patterns indicate some useful ways that gameplay data can provide insights in two main areas. The first is for game designers to know how players are interacting with and learning from their game. The second is for players themselves as well as their teachers to get information on how they are progressing through the game, and to provide help they may need based on strategies and misconceptions identified in the data.

Keywords: MMOG, decision tree, genetics, gaming-learning interaction

Procedia PDF Downloads 353
1314 Exploring the Correlation between Population Distribution and Urban Heat Island under Urban Data: Taking Shenzhen Urban Heat Island as an Example

Authors: Wang Yang

Abstract:

Shenzhen is a modern city of China's reform and opening-up policy, the development of urban morphology has been established on the administration of the Chinese government. This city`s planning paradigm is primarily affected by the spatial structure and human behavior. The subjective urban agglomeration center is divided into several groups and centers. In comparisons of this effect, the city development law has better to be neglected. With the continuous development of the internet, extensive data technology has been introduced in China. Data mining and data analysis has become important tools in municipal research. Data mining has been utilized to improve data cleaning such as receiving business data, traffic data and population data. Prior to data mining, government data were collected by traditional means, then were analyzed using city-relationship research, delaying the timeliness of urban development, especially for the contemporary city. Data update speed is very fast and based on the Internet. The city's point of interest (POI) in the excavation serves as data source affecting the city design, while satellite remote sensing is used as a reference object, city analysis is conducted in both directions, the administrative paradigm of government is broken and urban research is restored. Therefore, the use of data mining in urban analysis is very important. The satellite remote sensing data of the Shenzhen city in July 2018 were measured by the satellite Modis sensor and can be utilized to perform land surface temperature inversion, and analyze city heat island distribution of Shenzhen. This article acquired and classified the data from Shenzhen by using Data crawler technology. Data of Shenzhen heat island and interest points were simulated and analyzed in the GIS platform to discover the main features of functional equivalent distribution influence. Shenzhen is located in the east-west area of China. The city’s main streets are also determined according to the direction of city development. Therefore, it is determined that the functional area of the city is also distributed in the east-west direction. The urban heat island can express the heat map according to the functional urban area. Regional POI has correspondence. The research result clearly explains that the distribution of the urban heat island and the distribution of urban POIs are one-to-one correspondence. Urban heat island is primarily influenced by the properties of the underlying surface, avoiding the impact of urban climate. Using urban POIs as analysis object, the distribution of municipal POIs and population aggregation are closely connected, so that the distribution of the population corresponded with the distribution of the urban heat island.

Keywords: POI, satellite remote sensing, the population distribution, urban heat island thermal map

Procedia PDF Downloads 101
1313 Benchmarking Machine Learning Approaches for Forecasting Hotel Revenue

Authors: Rachel Y. Zhang, Christopher K. Anderson

Abstract:

A critical aspect of revenue management is a firm’s ability to predict demand as a function of price. Historically hotels have used simple time series models (regression and/or pick-up based models) owing to the complexities of trying to build casual models of demands. Machine learning approaches are slowly attracting attention owing to their flexibility in modeling relationships. This study provides an overview of approaches to forecasting hospitality demand – focusing on the opportunities created by machine learning approaches, including K-Nearest-Neighbors, Support vector machine, Regression Tree, and Artificial Neural Network algorithms. The out-of-sample performances of above approaches to forecasting hotel demand are illustrated by using a proprietary sample of the market level (24 properties) transactional data for Las Vegas NV. Causal predictive models can be built and evaluated owing to the availability of market level (versus firm level) data. This research also compares and contrast model accuracy of firm-level models (i.e. predictive models for hotel A only using hotel A’s data) to models using market level data (prices, review scores, location, chain scale, etc… for all hotels within the market). The prospected models will be valuable for hotel revenue prediction given the basic characters of a hotel property or can be applied in performance evaluation for an existed hotel. The findings will unveil the features that play key roles in a hotel’s revenue performance, which would have considerable potential usefulness in both revenue prediction and evaluation.

Keywords: hotel revenue, k-nearest-neighbors, machine learning, neural network, prediction model, regression tree, support vector machine

Procedia PDF Downloads 127
1312 The Prognostic Prediction Value of Positive Lymph Nodes Numbers for the Hypopharyngeal Squamous Cell Carcinoma

Authors: Wendu Pang, Yaxin Luo, Junhong Li, Yu Zhao, Danni Cheng, Yufang Rao, Minzi Mao, Ke Qiu, Yijun Dong, Fei Chen, Jun Liu, Jian Zou, Haiyang Wang, Wei Xu, Jianjun Ren

Abstract:

We aimed to compare the prognostic prediction value of positive lymph node number (PLNN) to the American Joint Committee on Cancer (AJCC) tumor, lymph node, and metastasis (TNM) staging system for patients with hypopharyngeal squamous cell carcinoma (HPSCC). A total of 826 patients with HPSCC from the Surveillance, Epidemiology, and End Results database (2004–2015) were identified and split into two independent cohorts: training (n=461) and validation (n=365). Univariate and multivariate Cox regression analyses were used to evaluate the prognostic effects of PLNN in patients with HPSCC. We further applied six Cox regression models to compare the survival predictive values of the PLNN and AJCC TNM staging system. PLNN showed a significant association with overall survival (OS) and cancer-specific survival (CSS) (P < 0.001) in both univariate and multivariable analyses, and was divided into three groups (PLNN 0, PLNN 1-5, and PLNN>5). In the training cohort, multivariate analysis revealed that the increased PLNN of HPSCC gave rise to significantly poor OS and CSS after adjusting for age, sex, tumor size, and cancer stage; this trend was also verified by the validation cohort. Additionally, the survival model incorporating a composite of PLNN and TNM classification (C-index, 0.705, 0.734) performed better than the PLNN and AJCC TNM models. PLNN can serve as a powerful survival predictor for patients with HPSCC and is a surrogate supplement for cancer staging systems.

Keywords: hypopharyngeal squamous cell carcinoma, positive lymph nodes number, prognosis, prediction models, survival predictive values

Procedia PDF Downloads 150
1311 Sarcasm Recognition System Using Hybrid Tone-Word Spotting Audio Mining Technique

Authors: Sandhya Baskaran, Hari Kumar Nagabushanam

Abstract:

Sarcasm sentiment recognition is an area of natural language processing that is being probed into in the recent times. Even with the advancements in NLP, typical translations of words, sentences in its context fail to provide the exact information on a sentiment or emotion of a user. For example, if something bad happens, the statement ‘That's just what I need, great! Terrific!’ is expressed in a sarcastic tone which could be misread as a positive sign by any text-based analyzer. In this paper, we are presenting a unique real time ‘word with its tone’ spotting technique which would provide the sentiment analysis for a tone or pitch of a voice in combination with the words being expressed. This hybrid approach increases the probability for identification of special sentiment like sarcasm much closer to the real world than by mining text or speech individually. The system uses a tone analyzer such as YIN-FFT which extracts pitch segment-wise that would be used in parallel with a speech recognition system. The clustered data is classified for sentiments and sarcasm score for each of it determined. Our Simulations demonstrates the improvement in f-measure of around 12% compared to existing detection techniques with increased precision and recall.

Keywords: sarcasm recognition, tone-word spotting, natural language processing, pitch analyzer

Procedia PDF Downloads 290
1310 Design and Implementation a Platform for Adaptive Online Learning Based on Fuzzy Logic

Authors: Budoor Al Abid

Abstract:

Educational systems are increasingly provided as open online services, providing guidance and support for individual learners. To adapt the learning systems, a proper evaluation must be made. This paper builds the evaluation model Fuzzy C Means Adaptive System (FCMAS) based on data mining techniques to assess the difficulty of the questions. The following steps are implemented; first using a dataset from an online international learning system called (slepemapy.cz) the dataset contains over 1300000 records with 9 features for students, questions and answers information with feedback evaluation. Next, a normalization process as preprocessing step was applied. Then FCM clustering algorithms are used to adaptive the difficulty of the questions. The result is three cluster labeled data depending on the higher Wight (easy, Intermediate, difficult). The FCM algorithm gives a label to all the questions one by one. Then Random Forest (RF) Classifier model is constructed on the clustered dataset uses 70% of the dataset for training and 30% for testing; the result of the model is a 99.9% accuracy rate. This approach improves the Adaptive E-learning system because it depends on the student behavior and gives accurate results in the evaluation process more than the evaluation system that depends on feedback only.

Keywords: machine learning, adaptive, fuzzy logic, data mining

Procedia PDF Downloads 192
1309 A Tutorial on Model Predictive Control for Spacecraft Maneuvering Problem with Theory, Experimentation and Applications

Authors: O. B. Iskender, K. V. Ling, V. Dubanchet, L. Simonini

Abstract:

This paper discusses the recent advances and future prospects of spacecraft position and attitude control using Model Predictive Control (MPC). First, the challenges of the space missions are summarized, in particular, taking into account the errors, uncertainties, and constraints imposed by the mission, spacecraft and, onboard processing capabilities. The summary of space mission errors and uncertainties provided in categories; initial condition errors, unmodeled disturbances, sensor, and actuator errors. These previous constraints are classified into two categories: physical and geometric constraints. Last, real-time implementation capability is discussed regarding the required computation time and the impact of sensor and actuator errors based on the Hardware-In-The-Loop (HIL) experiments. The rationales behind the scenarios’ are also presented in the scope of space applications as formation flying, attitude control, rendezvous and docking, rover steering, and precision landing. The objectives of these missions are explained, and the generic constrained MPC problem formulations are summarized. Three key design elements used in MPC design: the prediction model, the constraints formulation and the objective cost function are discussed. The prediction models can be linear time invariant or time varying depending on the geometry of the orbit, whether it is circular or elliptic. The constraints can be given as linear inequalities for input or output constraints, which can be written in the same form. Moreover, the recent convexification techniques for the non-convex geometrical constraints (i.e., plume impingement, Field-of-View (FOV)) are presented in detail. Next, different objectives are provided in a mathematical framework and explained accordingly. Thirdly, because MPC implementation relies on finding in real-time the solution to constrained optimization problems, computational aspects are also examined. In particular, high-speed implementation capabilities and HIL challenges are presented towards representative space avionics. This covers an analysis of future space processors as well as the requirements of sensors and actuators on the HIL experiments outputs. The HIL tests are investigated for kinematic and dynamic tests where robotic arms and floating robots are used respectively. Eventually, the proposed algorithms and experimental setups are introduced and compared with the authors' previous work and future plans. The paper concludes with a conjecture that MPC paradigm is a promising framework at the crossroads of space applications while could be further advanced based on the challenges mentioned throughout the paper and the unaddressed gap.

Keywords: convex optimization, model predictive control, rendezvous and docking, spacecraft autonomy

Procedia PDF Downloads 107
1308 Comparative Analysis of the Computer Methods' Usage for Calculation of Hydrocarbon Reserves in the Baltic Sea

Authors: Pavel Shcherban, Vlad Golovanov

Abstract:

Nowadays, the depletion of hydrocarbon deposits on the land of the Kaliningrad region leads to active geological exploration and development of oil and natural gas reserves in the southeastern part of the Baltic Sea. LLC 'Lukoil-Kaliningradmorneft' implements a comprehensive program for the development of the region's shelf in 2014-2023. Due to heterogeneity of reservoir rocks in various open fields, as well as with ambiguous conclusions on the contours of deposits, additional geological prospecting and refinement of the recoverable oil reserves are carried out. The key element is use of an effective technique of computer stock modeling at the first stage of processing of the received data. The following step uses information for the cluster analysis, which makes it possible to optimize the field development approaches. The article analyzes the effectiveness of various methods for reserves' calculation and computer modelling methods of the offshore hydrocarbon fields. Cluster analysis allows to measure influence of the obtained data on the development of a technical and economic model for mining deposits. The relationship between the accuracy of the calculation of recoverable reserves and the need of modernization of existing mining infrastructure, as well as the optimization of the scheme of opening and development of oil deposits, is observed.

Keywords: cluster analysis, computer modelling of deposits, correction of the feasibility study, offshore hydrocarbon fields

Procedia PDF Downloads 163
1307 Disrupted or Discounted Cash Flow: Impact of Digitisation on Business Valuation

Authors: Matthias Haerri, Tobias Huettche, Clemens Kustner

Abstract:

This article discusses the impact of digitization on business valuation. In order to become and remain ‘digital’, investments are necessary whose return on investment (ROI) often remains vague. This uncertainty is contradictory for a valuation, that rely on predictable cash flows, fixed capital structures and the steady state. However digitisation does not make a company valuation impossible, but traditional approaches must be reconsidered. The authors identify four areas that are to be changing: (1) Tools instead of intuition - In the future, company valuation will neither be art nor science, but craft. This does not require intuition, but experience and good tools. Digital evaluation tools beyond Excel will therefore gain in importance. (2) Real-time instead of deadline - At present, company valuations are always carried out on a case-by-case basis and on a specific key date. This will change with the digitalization and the introduction of web-based valuation tools. Company valuations can thus not only be carried out faster and more efficiently, but can also be offered more frequently. Instead of calculating the value for a previous key date, current and real-time valuations can be carried out. (3) Predictive planning instead of analysis of the past - Past data will also be needed in the future, but its use will not be limited to monovalent time series or key figure analyses. With pictures of ‘black swans’ and the ‘turkey illusion’ it was made clear to us that we build forecasts on too few data points of the past and underestimate the power of chance. Predictive planning can help here. (4) Convergence instead of residual value - Digital transformation shortens the lifespan of viable business models. If companies want to live forever, they have to change forever. For the company valuation, this means that the business model valid on the valuation date only has a limited service life.

Keywords: business valuation, corporate finance, digitisation, disruption

Procedia PDF Downloads 130
1306 Use of Locally Effective Microorganisms in Conjunction with Biochar to Remediate Mine-Impacted Soils

Authors: Thomas F. Ducey, Kristin M. Trippe, James A. Ippolito, Jeffrey M. Novak, Mark G. Johnson, Gilbert C. Sigua

Abstract:

The Oronogo-Duenweg mining belt –approximately 20 square miles around the Joplin, Missouri area– is a designated United States Environmental Protection Agency Superfund site due to lead-contaminated soil and groundwater by former mining and smelting operations. Over almost a century of mining (from 1848 to the late 1960’s), an estimated ten million tons of cadmium, lead, and zinc containing material have been deposited on approximately 9,000 acres. Sites that have undergone remediation, in which the O, A, and B horizons have been removed along with the lead contamination, the exposed C horizon remains incalcitrant to revegetation efforts. These sites also suffer from poor soil microbial activity, as measured by soil extracellular enzymatic assays, though 16S ribosomal ribonucleic acid (rRNA) indicates that microbial diversity is equal to sites that have avoided mine-related contamination. Soil analysis reveals low soil organic carbon, along with high levels of bio-available zinc, that reflect the poor soil fertility conditions and low microbial activity. Our study looked at the use of several materials to restore and remediate these sites, with the goal of improving soil health. The following materials, and their purposes for incorporation into the study, were as follows: manure-based biochar for the binding of zinc and other heavy metals responsible for phytotoxicity, locally sourced biosolids and compost to incorporate organic carbon into the depleted soils, effective microorganisms harvested from nearby pristine sites to provide a stable community for nutrient cycling in the newly composited 'soil material'. Our results indicate that all four materials used in conjunction result in the greatest benefit to these mine-impacted soils, based on above ground biomass, microbial biomass, and soil enzymatic activities.

Keywords: locally effective microorganisms, biochar, remediation, reclamation

Procedia PDF Downloads 214
1305 Impact of Transitioning to Renewable Energy Sources on Key Performance Indicators and Artificial Intelligence Modules of Data Center

Authors: Ahmed Hossam ElMolla, Mohamed Hatem Saleh, Hamza Mostafa, Lara Mamdouh, Yassin Wael

Abstract:

Artificial intelligence (AI) is reshaping industries, and its potential to revolutionize renewable energy and data center operations is immense. By harnessing AI's capabilities, we can optimize energy consumption, predict fluctuations in renewable energy generation, and improve the efficiency of data center infrastructure. This convergence of technologies promises a future where energy is managed more intelligently, sustainably, and cost-effectively. The integration of AI into renewable energy systems unlocks a wealth of opportunities. Machine learning algorithms can analyze vast amounts of data to forecast weather patterns, solar irradiance, and wind speeds, enabling more accurate energy production planning. AI-powered systems can optimize energy storage and grid management, ensuring a stable power supply even during intermittent renewable generation. Moreover, AI can identify maintenance needs for renewable energy infrastructure, preventing costly breakdowns and maximizing system lifespan. Data centers, which consume substantial amounts of energy, are prime candidates for AI-driven optimization. AI can analyze energy consumption patterns, identify inefficiencies, and recommend adjustments to cooling systems, server utilization, and power distribution. Predictive maintenance using AI can prevent equipment failures, reducing energy waste and downtime. Additionally, AI can optimize data placement and retrieval, minimizing energy consumption associated with data transfer. As AI transforms renewable energy and data center operations, modified Key Performance Indicators (KPIs) will emerge. Traditional metrics like energy efficiency and cost-per-megawatt-hour will continue to be relevant, but additional KPIs focused on AI's impact will be essential. These might include AI-driven cost savings, predictive accuracy of energy generation and consumption, and the reduction of carbon emissions attributed to AI-optimized operations. By tracking these KPIs, organizations can measure the success of their AI initiatives and identify areas for improvement. Ultimately, the synergy between AI, renewable energy, and data centers holds the potential to create a more sustainable and resilient future. By embracing these technologies, we can build smarter, greener, and more efficient systems that benefit both the environment and the economy.

Keywords: data center, artificial intelligence, renewable energy, energy efficiency, sustainability, optimization, predictive analytics, energy consumption, energy storage, grid management, data center optimization, key performance indicators, carbon emissions, resiliency

Procedia PDF Downloads 26
1304 Impacts of Social Support on Perceived Level of Stress and Self-Esteem among Students of Private Universities of Karachi-Pakistan

Authors: Sheeba Farhan

Abstract:

This study is conducted to explore the predictive relationship of perceived stress and self-esteem with social support of students and to explore the factors, which contribute to develop or enhance the level of stress in students of private universities in Karachi-Pakistan. After literature review following hypotheses were formulated; 1)social support would predict perceived stress of students of business administration of private organizations of Higher education, 2) social support would predict the self-esteem of students of private organizations of Higher education, 3) there will be a relationship of perceived stress and self-esteem of students of private organizations of Higher education, 4) there will be a relationship of self esteem and social support of students of private organizations of Higher education. Sample of the study is comprise of 100 students of private organizations of Higher education in Karachi- Pakistan (i.e. males= 50 & females= 50). The age range of participants is 18-26 years. The measures, used in the study are: Demographic information form, a semi structured interview form, Rosenberg self esteem scale (Rosenberg, 1965) and perceived stress scale (Cohen, Kamarck, and Mermelstein, 1983) and multidimensional scale of perceived social support (Zimet, 1988) Descriptive statistics is used for getting a better statistical view of characteristics of sample. Regression analysis is used to explore the predictive relationship of study related stress and self esteem with academic achievement of students of private organizations of Higher education. Percentages and ratios were calculated to explore the level of perceived stress with respect to Socio-demographic characteristics in students of private organizations of Higher education. Finding shows that social support is significantly associated with the higher level of self-esteem among students of graduation but insignificantly associated with stress that has been experienced by them. These results are correlated with a wide variety of studies in which social support has proposed to be a predictor of well being for the students.

Keywords: private universities of Karachi-Pakistan, Self-esteem, social support, stress

Procedia PDF Downloads 287
1303 Solar Power Generation in a Mining Town: A Case Study for Australia

Authors: Ryan Chalk, G. M. Shafiullah

Abstract:

Climate change is a pertinent issue facing governments and societies around the world. The industrial revolution has resulted in a steady increase in the average global temperature. The mining and energy production industries have been significant contributors to this change prompting government to intervene by promoting low emission technology within these sectors. This paper initially reviews the energy problem in Australia and the mining sector with a focus on the energy requirements and production methods utilised in Western Australia (WA). Renewable energy in the form of utility-scale solar photovoltaics (PV) provides a solution to these problems by providing emission-free energy which can be used to supplement the existing natural gas turbines in operation at the proposed site. This research presents a custom renewable solution for the mining site considering the specific township network, local weather conditions, and seasonal load profiles. A summary of the required PV output is presented to supply slightly over 50% of the towns power requirements during the peak (summer) period, resulting in close to full coverage in the trench (winter) period. Dig Silent Power Factory Software has been used to simulate the characteristics of the existing infrastructure and produces results of integrating PV. Large scale PV penetration in the network introduce technical challenges, that includes; voltage deviation, increased harmonic distortion, increased available fault current and power factor. Results also show that cloud cover has a dramatic and unpredictable effect on the output of a PV system. The preliminary analyses conclude that mitigation strategies are needed to overcome voltage deviations, unacceptable levels of harmonics, excessive fault current and low power factor. Mitigation strategies are proposed to control these issues predominantly through the use of high quality, made for purpose inverters. Results show that use of inverters with harmonic filtering reduces the level of harmonic injections to an acceptable level according to Australian standards. Furthermore, the configuration of inverters to supply active and reactive power assist in mitigating low power factor problems. Use of FACTS devices; SVC and STATCOM also reduces the harmonics and improve the power factor of the network, and finally, energy storage helps to smooth the power supply.

Keywords: climate change, mitigation strategies, photovoltaic (PV), power quality

Procedia PDF Downloads 163
1302 Tc-99m MIBI Scintigraphy to Differentiate Malignant from Benign Lesions, Detected on Planar Bone Scan

Authors: Aniqa Jabeen

Abstract:

The aim of this study was to evaluate the effectiveness of Tc-99m MIBI (Technetium 99-methoxy-iso-butyl-isonitrile) scintigraphy to differentiate malignancies from benign lesions, which were detected on planar bone scans. Materials and Methods: 59 patients with bone lesions were enrolled in the study. The scintigraphic findings were compared with the clinical, radiological and the histological findings. Each patient initially underwent a three-phase bone scan with Tc-99m MDP (Methylene Diphosphonate) and if evidence of lesion found, the patient then underwent a dynamic and static MIBI scintigraphy after three to four days. The MDP and MIBI scans were evaluated visually and quantitatively. For quantitative analysis count ratios of lesions and contralateral normal side (L/C) were taken by region of interests drawn on scans. The Student T test was applied to assess the significant difference between benign and malignant lesions p-value < 0.05 was considered significant. Result: The MDP scans showed the increase tracer uptake, but there was no significant difference between benign and malignant uptake of the radiotracer. However significant difference (p-value 0.015), in uptake was seen in malignant (L/C = 3.51 ± 1.02) and benign lesion (L/C = 2.50±0.42) on MIBI scan. Three of thirty benign lesions did not show significant MIBI uptake. Seven malignant appeared as false negatives. Specificity of the scan was 86.66%, and its Negative Predictive Value (NPV) was 81.25% whereas the sensitivity of scan was 79.31%. In excluding the axial metastasis from the lesions, the sensitivity of MIBI scan increased to 91.66% and the NPV also increased to 92.85%. Conclusion: MIBI scintigraphy provides its usefulness by distinguishing malignant from benign lesions. MIBI also correctly identifies metastatic lesions. The negative predictive value of the scan points towards its ability to accurately diagnose the normal (benign) cases. However, biopsy remains the gold standard and a definitive diagnostic modality in musculoskeletal tumors. MIBI scan provides useful information in preoperative assessment and in distinguishing between malignant and benign lesions.

Keywords: benign, malignancies, MDP bone scan, MIBI scintigraphy

Procedia PDF Downloads 400
1301 Discerning Divergent Nodes in Social Networks

Authors: Mehran Asadi, Afrand Agah

Abstract:

In data mining, partitioning is used as a fundamental tool for classification. With the help of partitioning, we study the structure of data, which allows us to envision decision rules, which can be applied to classification trees. In this research, we used online social network dataset and all of its attributes (e.g., Node features, labels, etc.) to determine what constitutes an above average chance of being a divergent node. We used the R statistical computing language to conduct the analyses in this report. The data were found on the UC Irvine Machine Learning Repository. This research introduces the basic concepts of classification in online social networks. In this work, we utilize overfitting and describe different approaches for evaluation and performance comparison of different classification methods. In classification, the main objective is to categorize different items and assign them into different groups based on their properties and similarities. In data mining, recursive partitioning is being utilized to probe the structure of a data set, which allow us to envision decision rules and apply them to classify data into several groups. Estimating densities is hard, especially in high dimensions, with limited data. Of course, we do not know the densities, but we could estimate them using classical techniques. First, we calculated the correlation matrix of the dataset to see if any predictors are highly correlated with one another. By calculating the correlation coefficients for the predictor variables, we see that density is strongly correlated with transitivity. We initialized a data frame to easily compare the quality of the result classification methods and utilized decision trees (with k-fold cross validation to prune the tree). The method performed on this dataset is decision trees. Decision tree is a non-parametric classification method, which uses a set of rules to predict that each observation belongs to the most commonly occurring class label of the training data. Our method aggregates many decision trees to create an optimized model that is not susceptible to overfitting. When using a decision tree, however, it is important to use cross-validation to prune the tree in order to narrow it down to the most important variables.

Keywords: online social networks, data mining, social cloud computing, interaction and collaboration

Procedia PDF Downloads 153
1300 A Methodology for Developing New Technology Ideas to Avoid Patent Infringement: F-Term Based Patent Analysis

Authors: Kisik Song, Sungjoo Lee

Abstract:

With the growing importance of intangible assets recently, the impact of patent infringement on the business of a company has become more evident. Accordingly, it is essential for firms to estimate the risk of patent infringement risk before developing a technology and create new technology ideas to avoid the risk. Recognizing the needs, several attempts have been made to help develop new technology opportunities and most of them have focused on identifying emerging vacant technologies from patent analysis. In these studies, the IPC (International Patent Classification) system or keywords from text-mining application to patent documents was generally used to define vacant technologies. Unlike those studies, this study adopted F-term, which classifies patent documents according to the technical features of the inventions described in them. Since the technical features are analyzed by various perspectives by F-term, F-term provides more detailed information about technologies compared to IPC while more systematic information compared to keywords. Therefore, if well utilized, it can be a useful guideline to create a new technology idea. Recognizing the potential of F-term, this paper aims to suggest a novel approach to developing new technology ideas to avoid patent infringement based on F-term. For this purpose, we firstly collected data about F-term and then applied text-mining to the descriptions about classification criteria and attributes. From the text-mining results, we could identify other technologies with similar technical features of the existing one, the patented technology. Finally, we compare the technologies and extract the technical features that are commonly used in other technologies but have not been used in the existing one. These features are presented in terms of “purpose”, “function”, “structure”, “material”, “method”, “processing and operation procedure” and “control means” and so are useful for creating new technology ideas that help avoid infringing patent rights of other companies. Theoretically, this is one of the earliest attempts to adopt F-term to patent analysis; the proposed methodology can show how to best take advantage of F-term with the wealth of technical information. In practice, the proposed methodology can be valuable in the ideation process for successful product and service innovation without infringing the patents of other companies.

Keywords: patent infringement, new technology ideas, patent analysis, F-term

Procedia PDF Downloads 263