Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 456

Search results for: streaming analytics

276 A Guide to the Implementation of Ambisonics Super Stereo

Authors: Alessio Mastrorillo, Giuseppe Silvi, Francesco Scagliola

Abstract:

In this work, we introduce an Ambisonics decoder with an implementation of the C-format, also called Super Stereo. This format is an alternative to conventional stereo and binaural decoding. Unlike those, this format conveys audio information from the horizontal plane and works with stereo speakers and headphones. The two C-format channels can also return a reconstructed planar B-format. This work provides an open-source implementation for this format. We implement an all-pass filter for signal quadrature, as required by the decoding equations. This filter works with six Biquads in a cascade configuration, with values for control frequency and quality factor discovered experimentally. The phase response of the filter delivers a small error in the 20-14.000Hz range. The decoder has been tested with audio sources up to 192kHz sample rate, returning pristine sound quality and detailed stereo image. It has been included in the Envelop for Live suite and is available as an open-source repository. This decoder has applications in Virtual Reality and 360° audio productions, music composition, and online streaming.

Keywords: ambisonics, UHJ, quadrature filter, virtual reality, Gerzon, decoder, stereo, binaural, biquad

Procedia PDF Downloads 91

275 Quality Assurance in Cardiac Disorder Detection Images

Authors: Anam Naveed, Asma Andleeb, Mehreen Sirshar

Abstract:

In the article, Image processing techniques have been applied on cardiac images for enhancing the image quality. Two types of methodologies considers for survey, invasive techniques and non-invasive techniques. Different image processes for improvement of cardiac image quality and reduce the amount of radiation exposure for invasive techniques are explored. Different image processing algorithms for enhancing the noninvasive cardiac image qualities are described. Beside these two methodologies, third methodology has applied on live streaming of heart rate on ECG window for extracting necessary information, removing noise and enhancing quality. Sensitivity analyses have been carried out to investigate the impacts of cardiac images for diagnosis of cardiac arteries disease and how the enhancement on images will help the cardiologist to diagnoses disease. The paper evaluates strengths and weaknesses of different techniques applied for improved the image quality and draw a conclusion. Some specific limitations must be considered for whole survey, like the patient heart beat must be 70-75 beats/minute while doing the angiography, similarly patient weight and exposure radiation amount has some limitation.

Keywords: cardiac images, CT angiography, critical analysis, exposure radiation, invasive techniques, invasive techniques, non-invasive techniques

Procedia PDF Downloads 352

274 Cyber Security Enhancement via Software Defined Pseudo-Random Private IP Address Hopping

Authors: Andre Slonopas, Zona Kostic, Warren Thompson

Abstract:

Obfuscation is one of the most useful tools to prevent network compromise. Previous research focused on the obfuscation of the network communications between external-facing edge devices. This work proposes the use of two edge devices, external and internal facing, which communicate via private IPv4 addresses in a software-defined pseudo-random IP hopping. This methodology does not require additional IP addresses and/or resources to implement. Statistical analyses demonstrate that the hopping surface must be at least 1e3 IP addresses in size with a broad standard deviation to minimize the possibility of coincidence of monitored and communication IPs. The probability of breaking the hopping algorithm requires a collection of at least 1e6 samples, which for large hopping surfaces will take years to collect. The probability of dropped packets is controlled via memory buffers and the frequency of hops and can be reduced to levels acceptable for video streaming. This methodology provides an impenetrable layer of security ideal for information and supervisory control and data acquisition systems.

Keywords: moving target defense, cybersecurity, network security, hopping randomization, software defined network, network security theory

Procedia PDF Downloads 187

273 Examining the Relationship Between Traditional Property Rights and Online Intellectual Property Rights in the Digital Age

Authors: Luljeta Plakolli-Kasumi

Abstract:

In the digital age, the relationship between traditional property rights and online intellectual property rights is becoming increasingly complex. On the one hand, the internet and advancements in technology have allowed for the widespread distribution and use of digital content, making it easier for individuals and businesses to access and share information. On the other hand, the rise of digital piracy and illegal file-sharing has led to increased concerns about the protection of intellectual property rights. This paper aims to examine the relationship between traditional property rights and online intellectual property rights in the digital age by analyzing the current legal frameworks, key challenges and controversies that arise, and potential solutions for addressing these issues. The paper will look at how traditional property rights concepts such as ownership and possession are being applied in the online context and how they intersect with new and evolving forms of intellectual property such as digital downloads, streaming services, and online content creation. It will also discuss the tension between the need for strong intellectual property protection to encourage creativity and innovation and the public interest in promoting access to information and knowledge. Ultimately, the paper will explore how the legal system can adapt to better balance the interests of property owners, creators, and users in the digital age.

Keywords: intellectual property, traditional property, digital age, digital content

Procedia PDF Downloads 92

272 Myers-Briggs Type Index Personality Type Classification Based on an Individual’s Spotify Playlists

Authors: Sefik Can Karakaya, Ibrahim Demir

Abstract:

In this study, the relationship between musical preferences and personality traits has been investigated in terms of Spotify audio analysis features. The aim of this paper is to build such a classifier capable of segmenting people into their Myers-Briggs Type Index (MBTI) personality type based on their Spotify playlists. Music takes an important place in the lives of people all over the world and online music streaming platforms make it easier to reach musical contents. In this context, the motivation to build such a classifier is allowing people to gain access to their MBTI personality type and perhaps for more reliably and more quickly. For this purpose, logistic regression and deep neural networks have been selected for classifier and their performances are compared. In conclusion, it has been found that musical preferences differ statistically between personality traits, and evaluated models are able to distinguish personality types based on given musical data structure with over %60 accuracy rate.

Keywords: myers-briggs type indicator, music psychology, Spotify, behavioural user profiling, deep neural networks, logistic regression

Procedia PDF Downloads 145

271 A Study of Predicting Judgments on Causes of Online Privacy Invasions: Based on U.S Judicial Cases

Authors: Minjung Park, Sangmi Chai, Myoung Jun Lee

Abstract:

Since there are growing concerns on online privacy, enterprises could involve various personal privacy infringements cases resulting legal causations. For companies that are involving online business, it is important for them to pay extra attentions to protect users’ privacy. If firms can aware consequences from possible online privacy invasion cases, they can more actively prevent future online privacy infringements. This study attempts to predict the probability of ruling types caused by various invasion cases under U.S Personal Privacy Act. More specifically, this research explores online privacy invasion cases which was sentenced guilty to identify types of criminal punishments such as penalty, imprisonment, probation as well as compensation in civil cases. Based on the 853 U.S judicial cases ranged from January, 2000 to May, 2016, which related on data privacy, this research examines the relationship between personal information infringements cases and adjudications. Upon analysis results of 41,724 words extracted from 853 regal cases, this study examined online users’ privacy invasion cases to predict the probability of conviction for a firm as an offender in both of criminal and civil law. This research specifically examines that a cause of privacy infringements and a judgment type, whether it leads a civil or criminal liability, from U.S court. This study applies network text analysis (NTA) for data analysis, which is regarded as a useful method to discover embedded social trends within texts. According to our research results, certain online privacy infringement cases caused by online spamming and adware have a high possibility that firms are liable in the case. Our research results provide meaningful insights to academia as well as industry. First, our study is providing a new insight by applying Big Data analytics to legal cases so that it can predict the cause of invasions and legal consequences. Since there are few researches applying big data analytics in the domain of law, specifically in online privacy, this study suggests new area that future studies can explore. Secondly, this study reflects social influences, such as a development of privacy invasion technologies and changes of users’ level of awareness of online privacy on judicial cases analysis by adopting NTA method. Our research results indicate that firms need to improve technical and managerial systems to protect users’ online privacy to avoid negative legal consequences.

Keywords: network text analysis, online privacy invasions, personal information infringements, predicting judgements

Procedia PDF Downloads 229

270 Microclimate Variations in Rio de Janeiro Related to Massive Public Transportation

Authors: Marco E. O. Jardim, Frederico A. M. Souza, Valeria M. Bastos, Myrian C. A. Costa, Nelson F. F. Ebecken

Abstract:

Urban public transportation in Rio de Janeiro is based on bus lines, powered by diesel, and four limited metro lines that support only some neighborhoods. This work presents an infrastructure built to better understand microclimate variations related to massive urban transportation in some specific areas of the city. The use of sensor nodes with small analytics capacity provides environmental information to population or public services. The analyses of data collected from a few small sensors positioned near some heavy traffic streets show the harmful impact due to poor bus route plan.

Keywords: big data, IoT, public transportation, public health system

Procedia PDF Downloads 255

269 Quantum Decision Making with Small Sample for Network Monitoring and Control

Authors: Tatsuya Otoshi, Masayuki Murata

Abstract:

With the development and diversification of applications on the Internet, applications that require high responsiveness, such as video streaming, are becoming mainstream. Application responsiveness is not only a matter of communication delay but also a matter of time required to grasp changes in network conditions. The tradeoff between accuracy and measurement time is a challenge in network control. We people make countless decisions all the time, and our decisions seem to resolve tradeoffs between time and accuracy. When making decisions, people are known to make appropriate choices based on relatively small samples. Although there have been various studies on models of human decision-making, a model that integrates various cognitive biases, called ”quantum decision-making,” has recently attracted much attention. However, the modeling of small samples has not been examined much so far. In this paper, we extend the model of quantum decision-making to model decision-making with a small sample. In the proposed model, the state is updated by value-based probability amplitude amplification. By analytically obtaining a lower bound on the number of samples required for decision-making, we show that decision-making with a small number of samples is feasible.

Keywords: quantum decision making, small sample, MPEG-DASH, Grover's algorithm

Procedia PDF Downloads 80

268 An Analysis of Privacy and Security for Internet of Things Applications

Authors: Dhananjay Singh, M. Abdullah-Al-Wadud

Abstract:

The Internet of Things is a concept of a large scale ecosystem of wireless actuators. The actuators are defined as things in the IoT, those which contribute or produces some data to the ecosystem. However, ubiquitous data collection, data security, privacy preserving, large volume data processing, and intelligent analytics are some of the key challenges into the IoT technologies. In order to solve the security requirements, challenges and threats in the IoT, we have discussed a message authentication mechanism for IoT applications. Finally, we have discussed data encryption mechanism for messages authentication before propagating into IoT networks.

Keywords: Internet of Things (IoT), message authentication, privacy, security

Procedia PDF Downloads 384

267 Real-Time Network Anomaly Detection Systems Based on Machine-Learning Algorithms

Authors: Zahra Ramezanpanah, Joachim Carvallo, Aurelien Rodriguez

Abstract:

This paper aims to detect anomalies in streaming data using machine learning algorithms. In this regard, we designed two separate pipelines and evaluated the effectiveness of each separately. The first pipeline, based on supervised machine learning methods, consists of two phases. In the first phase, we trained several supervised models using the UNSW-NB15 data-set. We measured the efficiency of each using different performance metrics and selected the best model for the second phase. At the beginning of the second phase, we first, using Argus Server, sniffed a local area network. Several types of attacks were simulated and then sent the sniffed data to a running algorithm at short intervals. This algorithm can display the results of each packet of received data in real-time using the trained model. The second pipeline presented in this paper is based on unsupervised algorithms, in which a Temporal Graph Network (TGN) is used to monitor a local network. The TGN is trained to predict the probability of future states of the network based on its past behavior. Our contribution in this section is introducing an indicator to identify anomalies from these predicted probabilities.

Keywords: temporal graph network, anomaly detection, cyber security, IDS

Procedia PDF Downloads 103

266 Local Binary Patterns-Based Statistical Data Analysis for Accurate Soccer Match Prediction

Authors: Mohammad Ghahramani, Fahimeh Saei Manesh

Abstract:

Winning a soccer game is based on thorough and deep analysis of the ongoing match. On the other hand, giant gambling companies are in vital need of such analysis to reduce their loss against their customers. In this research work, we perform deep, real-time analysis on every soccer match around the world that distinguishes our work from others by focusing on particular seasons, teams and partial analytics. Our contributions are presented in the platform called “Analyst Masters.” First, we introduce various sources of information available for soccer analysis for teams around the world that helped us record live statistical data and information from more than 50,000 soccer matches a year. Our second and main contribution is to introduce our proposed in-play performance evaluation. The third contribution is developing new features from stable soccer matches. The statistics of soccer matches and their odds before and in-play are considered in the image format versus time including the halftime. Local Binary patterns, (LBP) is then employed to extract features from the image. Our analyses reveal incredibly interesting features and rules if a soccer match has reached enough stability. For example, our “8-minute rule” implies if 'Team A' scores a goal and can maintain the result for at least 8 minutes then the match would end in their favor in a stable match. We could also make accurate predictions before the match of scoring less/more than 2.5 goals. We benefit from the Gradient Boosting Trees, GBT, to extract highly related features. Once the features are selected from this pool of data, the Decision trees decide if the match is stable. A stable match is then passed to a post-processing stage to check its properties such as betters’ and punters’ behavior and its statistical data to issue the prediction. The proposed method was trained using 140,000 soccer matches and tested on more than 100,000 samples achieving 98% accuracy to select stable matches. Our database from 240,000 matches shows that one can get over 20% betting profit per month using Analyst Masters. Such consistent profit outperforms human experts and shows the inefficiency of the betting market. Top soccer tipsters achieve 50% accuracy and 8% monthly profit in average only on regional matches. Both our collected database of more than 240,000 soccer matches from 2012 and our algorithm would greatly benefit coaches and punters to get accurate analysis.

Keywords: soccer, analytics, machine learning, database

Procedia PDF Downloads 239

265 A Technique for Image Segmentation Using K-Means Clustering Classification

Authors: Sadia Basar, Naila Habib, Awais Adnan

Abstract:

The paper presents the Technique for Image Segmentation Using K-Means Clustering Classification. The presented algorithms were specific, however, missed the neighboring information and required high-speed computerized machines to run the segmentation algorithms. Clustering is the process of partitioning a group of data points into a small number of clusters. The proposed method is content-aware and feature extraction method which is able to run on low-end computerized machines, simple algorithm, required low-quality streaming, efficient and used for security purpose. It has the capability to highlight the boundary and the object. At first, the user enters the data in the representation of the input. Then in the next step, the digital image is converted into groups clusters. Clusters are divided into many regions. The same categories with same features of clusters are assembled within a group and different clusters are placed in other groups. Finally, the clusters are combined with respect to similar features and then represented in the form of segments. The clustered image depicts the clear representation of the digital image in order to highlight the regions and boundaries of the image. At last, the final image is presented in the form of segments. All colors of the image are separated in clusters.

Keywords: clustering, image segmentation, K-means function, local and global minimum, region

Procedia PDF Downloads 376

264 Correlation Mapping for Measuring Platelet Adhesion

Authors: Eunseop Yeom

Abstract:

Platelets can be activated by the surrounding blood flows where a blood vessel is narrowed as a result of atherosclerosis. Numerous studies have been conducted to identify the relation between platelets activation and thrombus formation. To measure platelet adhesion, this study proposes an image analysis technique. Blood samples are delivered in the microfluidic channel, and then platelets are activated by a stenotic micro-channel with 90% severity. By applying proposed correlation mapping, which visualizes decorrelation of the streaming blood flow, the area of adhered platelets (APlatelet) was estimated without labeling platelets. In order to evaluate the performance of correlation mapping on the detection of platelet adhesion, the effect of tile size was investigated by calculating 2D correlation coefficients with binary images obtained by manual labeling and the correlation mapping method with different sizes of the square tile ranging from 3 to 50 pixels. The maximum 2D correlation coefficient is observed with the optimum tile size of 5×5 pixels. As the area of the platelet adhesion increases, the platelets plug the channel and there is only a small amount of blood flows. This image analysis could provide new insights for better understanding of the interactions between platelet aggregation and blood flows in various physiological conditions.

Keywords: platelet activation, correlation coefficient, image analysis, shear rate

Procedia PDF Downloads 335

263 Signal Strength Based Multipath Routing for Mobile Ad Hoc Networks

Authors: Chothmal

Abstract:

In this paper, we present a route discovery process which uses the signal strength on a link as a parameter of its inclusion in the route discovery method. The proposed signal-to-interference and noise ratio (SINR) based multipath reactive routing protocol is named as SINR-MP protocol. The proposed SINR-MP routing protocols has two following two features: a) SINR-MP protocol selects routes based on the SINR of the links during the route discovery process therefore it select the routes which has long lifetime and low frame error rate for data transmission, and b) SINR-MP protocols route discovery process is multipath which discovers more than one SINR based route between a given source destination pair. The multiple routes selected by our SINR-MP protocol are node-disjoint in nature which increases their robustness against link failures, as failure of one route will not affect the other route. The secondary route is very useful in situations where the primary route is broken because we can now use the secondary route without causing a new route discovery process. Due to this, the network overhead caused by a route discovery process is avoided. This increases the network performance greatly. The proposed SINR-MP routing protocol is implemented in the trail version of network simulator called Qualnet.

Keywords: ad hoc networks, quality of service, video streaming, H.264/SVC, multiple routes, video traces

Procedia PDF Downloads 250

262 The Role of Technology in Transforming the Finance, Banking, and Insurance Sectors

Authors: Farid Fahami

Abstract:

This article explores the transformative role of technology in the finance, banking, and insurance sectors. It examines key technological trends such as AI, blockchain, data analytics, and digital platforms and their impact on operations, customer experiences, and business models. The article highlights the benefits of technology adoption, including improved efficiency, cost reduction, enhanced customer experiences, and expanded financial inclusion. It also addresses challenges like cybersecurity, data privacy, and the need for upskilling. Real-world case studies demonstrate successful technology integration, and recommendations for stakeholders emphasize embracing innovation and collaboration. The article concludes by emphasizing the importance of technology in shaping the future of these sectors.

Keywords: banking, finance, insurance, technology

Procedia PDF Downloads 72

261 Integrating Machine Learning and Rule-Based Decision Models for Enhanced B2B Sales Forecasting and Customer Prioritization

Authors: Wenqi Liu, Reginald Bailey

Abstract:

This study proposes a comprehensive and effective approach to business-to-business (B2B) sales forecasting by integrating advanced machine learning models with a rule-based decision-making framework. The methodology addresses the critical challenge of optimizing sales pipeline performance and improving conversion rates through predictive analytics and actionable insights. The first component involves developing a classification model to predict the likelihood of conversion, aiming to outperform traditional methods such as logistic regression in terms of accuracy, precision, recall, and F1 score. Feature importance analysis highlights key predictive factors, such as client revenue size and sales velocity, providing valuable insights into conversion dynamics. The second component focuses on forecasting sales value using a regression model, designed to achieve superior performance compared to linear regression by minimizing mean absolute error (MAE), mean squared error (MSE), and maximizing R-squared metrics. The regression analysis identifies primary drivers of sales value, further informing data-driven strategies. To bridge the gap between predictive modeling and actionable outcomes, a rule-based decision framework is introduced. This model categorizes leads into high, medium, and low priorities based on thresholds for conversion probability and predicted sales value. By combining classification and regression outputs, this framework enables sales teams to allocate resources effectively, focus on high-value opportunities, and streamline lead management processes. The integrated approach significantly enhances lead prioritization, increases conversion rates, and drives revenue generation, offering a robust solution to the declining pipeline conversion rates faced by many B2B organizations. Our findings demonstrate the practical benefits of blending machine learning with decision-making frameworks, providing a scalable, data-driven solution for strategic sales optimization. This study underscores the potential of predictive analytics to transform B2B sales operations, enabling more informed decision-making and improved organizational outcomes in competitive markets.

Keywords: machine learning, XGBoost, regression, decision making framework, system engineering

Procedia PDF Downloads 25

260 The Synergistic Effects of Blockchain and AI on Enhancing Data Integrity and Decision-Making Accuracy in Smart Contracts

Authors: Sayor Ajfar Aaron, Sajjat Hossain Abir, Ashif Newaz, Mushfiqur Rahman

Abstract:

Investigating the convergence of blockchain technology and artificial intelligence, this paper examines their synergistic effects on data integrity and decision-making within smart contracts. By implementing AI-driven analytics on blockchain-based platforms, the research identifies improvements in automated contract enforcement and decision accuracy. The paper presents a framework that leverages AI to enhance transparency and trust while blockchain ensures immutable record-keeping, culminating in significantly optimized operational efficiencies in various industries.

Keywords: artificial intelligence, blockchain, data integrity, smart contracts

Procedia PDF Downloads 59

259 Entropy Risk Factor Model of Exchange Rate Prediction

Authors: Darrol Stanley, Levan Efremidze, Jannie Rossouw

Abstract:

We investigate the predictability of the USD/ZAR (South African Rand) exchange rate with sample entropy analytics for the period of 2004-2015. We calculate sample entropy based on the daily data of the exchange rate and conduct empirical implementation of several market timing rules based on these entropy signals. The dynamic investment portfolio based on entropy signals produces better risk adjusted performance than a buy and hold strategy. The returns are estimated on the portfolio values in U.S. dollars. These results are preliminary and do not yet account for reasonable transactions costs, although these are very small in currency markets.

Keywords: currency trading, entropy, market timing, risk factor model

Procedia PDF Downloads 271

258 Predicting the Success of Bank Telemarketing Using Artificial Neural Network

Authors: Mokrane Selma

Abstract:

The shift towards decision making (DM) based on artificial intelligence (AI) techniques will change the way in which consumer markets and our societies function. Through AI, predictive analytics is being used by businesses to identify these patterns and major trends with the objective to improve the DM and influence future business outcomes. This paper proposes an Artificial Neural Network (ANN) approach to predict the success of telemarketing calls for selling bank long-term deposits. To validate the proposed model, we uses the bank marketing data of 41188 phone calls. The ANN attains 98.93% of accuracy which outperforms other conventional classifiers and confirms that it is credible and valuable approach for telemarketing campaign managers.

Keywords: bank telemarketing, prediction, decision making, artificial intelligence, artificial neural network

Procedia PDF Downloads 160

257 Social Data Aggregator and Locator of Knowledge (STALK)

Authors: Rashmi Raghunandan, Sanjana Shankar, Rakshitha K. Bhat

Abstract:

Social media contributes a vast amount of data and information about individuals to the internet. This project will greatly reduce the need for unnecessary manual analysis of large and diverse social media profiles by filtering out and combining the useful information from various social media profiles, eliminating irrelevant data. It differs from the existing social media aggregators in that it does not provide a consolidated view of various profiles. Instead, it provides consolidated INFORMATION derived from the subject’s posts and other activities. It also allows analysis over multiple profiles and analytics based on several profiles. We strive to provide a query system to provide a natural language answer to questions when a user does not wish to go through the entire profile. The information provided can be filtered according to the different use cases it is used for.

Keywords: social network, analysis, Facebook, Linkedin, git, big data

Procedia PDF Downloads 444

256 Big Data Strategy for Telco: Network Transformation

Authors: F. Amin, S. Feizi

Abstract:

Big data has the potential to improve the quality of services; enable infrastructure that businesses depend on to adapt continually and efficiently; improve the performance of employees; help organizations better understand customers; and reduce liability risks. Analytics and marketing models of fixed and mobile operators are falling short in combating churn and declining revenue per user. Big Data presents new method to reverse the way and improve profitability. The benefits of Big Data and next-generation network, however, are more exorbitant than improved customer relationship management. Next generation of networks are in a prime position to monetize rich supplies of customer information—while being mindful of legal and privacy issues. As data assets are transformed into new revenue streams will become integral to high performance.

Keywords: big data, next generation networks, network transformation, strategy

Procedia PDF Downloads 361

255 The Relevance of Smart Technologies in Learning

Authors: Rachael Olubukola Afolabi

Abstract:

Immersive technologies known as X Reality or Cross Reality that include virtual reality augmented reality, and mixed reality have pervaded into the education system at different levels from elementary school to adult learning. Instructors, instructional designers, and learning experience specialists continue to find new ways to engage students in the learning process using technology. While the progression of web technologies has enhanced digital learning experiences, analytics on learning outcomes continue to be explored to determine the relevance of these technologies in learning. Digital learning has evolved from web 1.0 (static) to 4.0 (dynamic and interactive), and this evolution of technologies has also advanced teaching methods and approaches. This paper explores how these technologies are being utilized in learning and the results that educators and learners have identified as effective learning opportunities and approaches.

Keywords: immersive technologoes, virtual reality, augmented reality, technology in learning

Procedia PDF Downloads 145

254 Control the Flow of Big Data

Authors: Shizra Waris, Saleem Akhtar

Abstract:

Big data is a research area receiving attention from academia and IT communities. In the digital world, the amounts of data produced and stored have within a short period of time. Consequently this fast increasing rate of data has created many challenges. In this paper, we use functionalism and structuralism paradigms to analyze the genesis of big data applications and its current trends. This paper presents a complete discussion on state-of-the-art big data technologies based on group and stream data processing. Moreover, strengths and weaknesses of these technologies are analyzed. This study also covers big data analytics techniques, processing methods, some reported case studies from different vendor, several open research challenges and the chances brought about by big data. The similarities and differences of these techniques and technologies based on important limitations are also investigated. Emerging technologies are suggested as a solution for big data problems.

Keywords: computer, it community, industry, big data

Procedia PDF Downloads 194

253 Enhancing Large Language Models' Data Analysis Capability with Planning-and-Execution and Code Generation Agents: A Use Case for Southeast Asia Real Estate Market Analytics

Authors: Kien Vu, Jien Min Soh, Mohamed Jahangir Abubacker, Piyawut Pattamanon, Soojin Lee, Suvro Banerjee

Abstract:

Recent advances in Generative Artificial Intelligence (GenAI), in particular Large Language Models (LLMs) have shown promise to disrupt multiple industries at scale. However, LLMs also present unique challenges, notably, these so-called "hallucination" which is the generation of outputs that are not grounded in the input data that hinders its adoption into production. Common practice to mitigate hallucination problem is utilizing Retrieval Agmented Generation (RAG) system to ground LLMs'response to ground truth. RAG converts the grounding documents into embeddings, retrieve the relevant parts with vector similarity between user's query and documents, then generates a response that is not only based on its pre-trained knowledge but also on the specific information from the retrieved documents. However, the RAG system is not suitable for tabular data and subsequent data analysis tasks due to multiple reasons such as information loss, data format, and retrieval mechanism. In this study, we have explored a novel methodology that combines planning-and-execution and code generation agents to enhance LLMs' data analysis capabilities. The approach enables LLMs to autonomously dissect a complex analytical task into simpler sub-tasks and requirements, then convert them into executable segments of code. In the final step, it generates the complete response from output of the executed code. When deployed beta version on DataSense, the property insight tool of PropertyGuru, the approach yielded promising results, as it was able to provide market insights and data visualization needs with high accuracy and extensive coverage by abstracting the complexities for real-estate agents and developers from non-programming background. In essence, the methodology not only refines the analytical process but also serves as a strategic tool for real estate professionals, aiding in market understanding and enhancement without the need for programming skills. The implication extends beyond immediate analytics, paving the way for a new era in the real estate industry characterized by efficiency and advanced data utilization.

Keywords: large language model, reasoning, planning and execution, code generation, natural language processing, prompt engineering, data analysis, real estate, data sense, PropertyGuru

Procedia PDF Downloads 88

252 The Use of Rule-Based Cellular Automata to Track and Forecast the Dispersal of Classical Biocontrol Agents at Scale, with an Application to the Fopius arisanus Fruit Fly Parasitoid

Authors: Agboka Komi Mensah, John Odindi, Elfatih M. Abdel-Rahman, Onisimo Mutanga, Henri Ez Tonnang

Abstract:

Ecosystems are networks of organisms and populations that form a community of various species interacting within their habitats. Such habitats are defined by abiotic and biotic conditions that establish the initial limits to a population's growth, development, and reproduction. The habitat’s conditions explain the context in which species interact to access resources such as food, water, space, shelter, and mates, allowing for feeding, dispersal, and reproduction. Dispersal is an essential life-history strategy that affects gene flow, resource competition, population dynamics, and species distributions. Despite the importance of dispersal in population dynamics and survival, understanding the mechanism underpinning the dispersal of organisms remains challenging. For instance, when an organism moves into an ecosystem for survival and resource competition, its progression is highly influenced by extrinsic factors such as its physiological state, climatic variables and ability to evade predation. Therefore, greater spatial detail is necessary to understand organism dispersal dynamics. Understanding organisms dispersal can be addressed using empirical and mechanistic modelling approaches, with the adopted approach depending on the study's purpose Cellular automata (CA) is an example of these approaches that have been successfully used in biological studies to analyze the dispersal of living organisms. Cellular automata can be briefly described as occupied cells by an individual that evolves based on proper decisions based on a set of neighbours' rules. However, in the ambit of modelling individual organisms dispersal at the landscape scale, we lack user friendly tools that do not require expertise in mathematical models and computing ability; such as a visual analytics framework for tracking and forecasting the dispersal behaviour of organisms. The term "visual analytics" (VA) describes a semiautomated approach to electronic data processing that is guided by users who can interact with data via an interface. Essentially, VA converts large amounts of quantitative or qualitative data into graphical formats that can be customized based on the operator's needs. Additionally, this approach can be used to enhance the ability of users from various backgrounds to understand data, communicate results, and disseminate information across a wide range of disciplines. To support effective analysis of the dispersal of organisms at the landscape scale, we therefore designed Pydisp which is a free visual data analytics tool for spatiotemporal dispersal modeling built in Python. Its user interface allows users to perform a quick and interactive spatiotemporal analysis of species dispersal using bioecological and climatic data. Pydisp enables reuse and upgrade through the use of simple principles such as Fuzzy cellular automata algorithms. The potential of dispersal modeling is demonstrated in a case study by predicting the dispersal of Fopius arisanus (Sonan), endoparasitoids to control Bactrocera dorsalis (Hendel) (Diptera: Tephritidae) in Kenya. The results obtained from our example clearly illustrate the parasitoid's dispersal process at the landscape level and confirm that dynamic processes in an agroecosystem are better understood when designed using mechanistic modelling approaches. Furthermore, as demonstrated in the example, the built software is highly effective in portraying the dispersal of organisms despite the unavailability of detailed data on the species dispersal mechanisms.

Keywords: cellular automata, fuzzy logic, landscape, spatiotemporal

Procedia PDF Downloads 79

251 Blame Classification through N-Grams in E-Commerce Customer Reviews

Authors: Subhadeep Mandal, Sujoy Bhattacharya, Pabitra Mitra, Diya Guha Roy, Seema Bhattacharya

Abstract:

E-commerce firms allow customers to evaluate and review the things they buy as a positive or bad experience. The e-commerce transaction processes are made up of a variety of diverse organizations and activities that operate independently but are connected together to complete the transaction (from placing an order to the goods reaching the client). After a negative shopping experience, clients frequently disregard the critical assessment of these businesses and submit their feedback on an all-over basis, which benefits certain enterprises but is tedious for others. In this article, we solely dealt with negative reviews and attempted to distinguish between negative reviews where the e-commerce firm is explicitly blamed by customers for a bad purchasing experience and other negative reviews.

Keywords: e-commerce, online shopping, customer reviews, customer behaviour, text analytics, n-grams classification

Procedia PDF Downloads 259

250 Emerging Technology for Business Intelligence Applications

Authors: Hsien-Tsen Wang

Abstract:

Business Intelligence (BI) has long helped organizations make informed decisions based on data-driven insights and gain competitive advantages in the marketplace. In the past two decades, businesses witnessed not only the dramatically increasing volume and heterogeneity of business data but also the emergence of new technologies, such as Artificial Intelligence (AI), Semantic Web (SW), Cloud Computing, and Big Data. It is plausible that the convergence of these technologies would bring more value out of business data by establishing linked data frameworks and connecting in ways that enable advanced analytics and improved data utilization. In this paper, we first review and summarize current BI applications and methodology. Emerging technologies that can be integrated into BI applications are then discussed. Finally, we conclude with a proposed synergy framework that aims at achieving a more flexible, scalable, and intelligent BI solution.

Keywords: business intelligence, artificial intelligence, semantic web, big data, cloud computing

Procedia PDF Downloads 98

249 Video Analytics on Pedagogy Using Big Data

Authors: Jamuna Loganath

Abstract:

Education is the key to the development of any individual’s personality. Today’s students will be tomorrow’s citizens of the global society. The education of the student is the edifice on which his/her future will be built. Schools therefore should provide an all-round development of students so as to foster a healthy society. The behaviors and the attitude of the students in school play an essential role for the success of the education process. Frequent reports of misbehaviors such as clowning, harassing classmates, verbal insults are becoming common in schools today. If this issue is left unattended, it may develop a negative attitude and increase the delinquent behavior. So, the need of the hour is to find a solution to this problem. To solve this issue, it is important to monitor the students’ behaviors in school and give necessary feedback and mentor them to develop a positive attitude and help them to become a successful grownup. Nevertheless, measuring students’ behavior and attitude is extremely challenging. None of the present technology has proven to be effective in this measurement process because actions, reactions, interactions, response of the students are rarely used in the course of the data due to complexity. The purpose of this proposal is to recommend an effective supervising system after carrying out a feasibility study by measuring the behavior of the Students. This can be achieved by equipping schools with CCTV cameras. These CCTV cameras installed in various schools of the world capture the facial expressions and interactions of the students inside and outside their classroom. The real time raw videos captured from the CCTV can be uploaded to the cloud with the help of a network. The video feeds get scooped into various nodes in the same rack or on the different racks in the same cluster in Hadoop HDFS. The video feeds are converted into small frames and analyzed using various Pattern recognition algorithms and MapReduce algorithm. Then, the video frames are compared with the bench marking database (good behavior). When misbehavior is detected, an alert message can be sent to the counseling department which helps them in mentoring the students. This will help in improving the effectiveness of the education process. As Video feeds come from multiple geographical areas (schools from different parts of the world), BIG DATA helps in real time analysis as it analyzes computationally to reveal patterns, trends, and associations, especially relating to human behavior and interactions. It also analyzes data that can’t be analyzed by traditional software applications such as RDBMS, OODBMS. It has also proven successful in handling human reactions with ease. Therefore, BIG DATA could certainly play a vital role in handling this issue. Thus, effectiveness of the education process can be enhanced with the help of video analytics using the latest BIG DATA technology.

Keywords: big data, cloud, CCTV, education process

Procedia PDF Downloads 240

248 A Methodology of Using Fuzzy Logics and Data Analytics to Estimate the Life Cycle Indicators of Solar Photovoltaics

Authors: Thor Alexis Sazon, Alexander Guzman-Urbina, Yasuhiro Fukushima

Abstract:

This study outlines the method of how to develop a surrogate life cycle model based on fuzzy logic using three fuzzy inference methods: (1) the conventional Fuzzy Inference System (FIS), (2) the hybrid system of Data Analytics and Fuzzy Inference (DAFIS), which uses data clustering for defining the membership functions, and (3) the Adaptive-Neuro Fuzzy Inference System (ANFIS), a combination of fuzzy inference and artificial neural network. These methods were demonstrated with a case study where the Global Warming Potential (GWP) and the Levelized Cost of Energy (LCOE) of solar photovoltaic (PV) were estimated using Solar Irradiation, Module Efficiency, and Performance Ratio as inputs. The effects of using different fuzzy inference types, either Sugeno- or Mamdani-type, and of changing the number of input membership functions to the error between the calibration data and the model-generated outputs were also illustrated. The solution spaces of the three methods were consequently examined with a sensitivity analysis. ANFIS exhibited the lowest error while DAFIS gave slightly lower errors compared to FIS. Increasing the number of input membership functions helped with error reduction in some cases but, at times, resulted in the opposite. Sugeno-type models gave errors that are slightly lower than those of the Mamdani-type. While ANFIS is superior in terms of error minimization, it could generate solutions that are questionable, i.e. the negative GWP values of the Solar PV system when the inputs were all at the upper end of their range. This shows that the applicability of the ANFIS models highly depends on the range of cases at which it was calibrated. FIS and DAFIS generated more intuitive trends in the sensitivity runs. DAFIS demonstrated an optimal design point wherein increasing the input values does not improve the GWP and LCOE anymore. In the absence of data that could be used for calibration, conventional FIS presents a knowledge-based model that could be used for prediction. In the PV case study, conventional FIS generated errors that are just slightly higher than those of DAFIS. The inherent complexity of a Life Cycle study often hinders its widespread use in the industry and policy-making sectors. While the methodology does not guarantee a more accurate result compared to those generated by the Life Cycle Methodology, it does provide a relatively simpler way of generating knowledge- and data-based estimates that could be used during the initial design of a system.

Keywords: solar photovoltaic, fuzzy logic, inference system, artificial neural networks

Procedia PDF Downloads 167

247 Data Quality as a Pillar of Data-Driven Organizations: Exploring the Benefits of Data Mesh

Authors: Marc Bachelet, Abhijit Kumar Chatterjee, José Manuel Avila

Abstract:

Data quality is a key component of any data-driven organization. Without data quality, organizations cannot effectively make data-driven decisions, which often leads to poor business performance. Therefore, it is important for an organization to ensure that the data they use is of high quality. This is where the concept of data mesh comes in. Data mesh is an organizational and architectural decentralized approach to data management that can help organizations improve the quality of data. The concept of data mesh was first introduced in 2020. Its purpose is to decentralize data ownership, making it easier for domain experts to manage the data. This can help organizations improve data quality by reducing the reliance on centralized data teams and allowing domain experts to take charge of their data. This paper intends to discuss how a set of elements, including data mesh, are tools capable of increasing data quality. One of the key benefits of data mesh is improved metadata management. In a traditional data architecture, metadata management is typically centralized, which can lead to data silos and poor data quality. With data mesh, metadata is managed in a decentralized manner, ensuring accurate and up-to-date metadata, thereby improving data quality. Another benefit of data mesh is the clarification of roles and responsibilities. In a traditional data architecture, data teams are responsible for managing all aspects of data, which can lead to confusion and ambiguity in responsibilities. With data mesh, domain experts are responsible for managing their own data, which can help provide clarity in roles and responsibilities and improve data quality. Additionally, data mesh can also contribute to a new form of organization that is more agile and adaptable. By decentralizing data ownership, organizations can respond more quickly to changes in their business environment, which in turn can help improve overall performance by allowing better insights into business as an effect of better reports and visualization tools. Monitoring and analytics are also important aspects of data quality. With data mesh, monitoring, and analytics are decentralized, allowing domain experts to monitor and analyze their own data. This will help in identifying and addressing data quality problems in quick time, leading to improved data quality. Data culture is another major aspect of data quality. With data mesh, domain experts are encouraged to take ownership of their data, which can help create a data-driven culture within the organization. This can lead to improved data quality and better business outcomes. Finally, the paper explores the contribution of AI in the coming years. AI can help enhance data quality by automating many data-related tasks, like data cleaning and data validation. By integrating AI into data mesh, organizations can further enhance the quality of their data. The concepts mentioned above are illustrated by AEKIDEN experience feedback. AEKIDEN is an international data-driven consultancy that has successfully implemented a data mesh approach. By sharing their experience, AEKIDEN can help other organizations understand the benefits and challenges of implementing data mesh and improving data quality.

Keywords: data culture, data-driven organization, data mesh, data quality for business success

Procedia PDF Downloads 137