Search results for: video segmentation
974 Automatic Differential Diagnosis of Melanocytic Skin Tumours Using Ultrasound and Spectrophotometric Data
Authors: Kristina Sakalauskiene, Renaldas Raisutis, Gintare Linkeviciute, Skaidra Valiukeviciene
Abstract:
Cutaneous melanoma is a melanocytic skin tumour that has a very poor prognosis, is highly resistant to treatment and tends to metastasize. Melanoma thickness is one of the most important biomarkers for disease staging, prognosis and surgery planning. In this study, we hypothesized that the automatic analysis of spectrophotometric images and high-frequency ultrasonic 2D data can improve differential diagnosis of cutaneous melanoma and provide additional information about tumour penetration depth. This paper presents a novel complex automatic system for non-invasive melanocytic skin tumour differential diagnosis and penetration depth evaluation. The system is composed of region of interest segmentation in spectrophotometric images and high-frequency ultrasound data, quantitative parameter evaluation, informative feature extraction and classification with a linear regression classifier. The segmentation of the melanocytic skin tumour region in the ultrasound image is based on parametric integrated backscattering coefficient calculation. The segmentation of the optical image is based on Otsu thresholding. In total, 29 quantitative tissue characterization parameters were evaluated using ultrasound data (11 acoustical, 4 shape and 15 textural parameters), along with 55 quantitative features of dermatoscopic and spectrophotometric images (using total melanin, dermal melanin, blood and collagen SIAgraphs acquired with the spectrophotometric imaging device SIAscope). In total, 102 melanocytic skin lesions (including 43 cutaneous melanomas) were examined using the SIAscope and an ultrasound system with a 22 MHz center frequency single element transducer. The diagnosis and Breslow thickness (pT) of each melanocytic skin tumour (MST) were evaluated during routine histological examination after excision and used as a reference. 
The results of this study have shown that automatic analysis of spectrophotometric and high-frequency ultrasound data can improve non-invasive classification accuracy of early-stage cutaneous melanoma and provide supplementary information about tumour penetration depth.
Keywords: cutaneous melanoma, differential diagnosis, high-frequency ultrasound, melanocytic skin tumours, spectrophotometric imaging
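The abstract above states that the optical image is segmented with Otsu thresholding. As a minimal sketch of that general technique (not the authors' implementation), the following picks the grey level that maximises between-class variance. The function name `otsu_threshold`, the bin count and the use of NumPy are assumptions made for illustration.

```python
import numpy as np

def otsu_threshold(image: np.ndarray, bins: int = 256) -> float:
    """Return the grey level that maximises between-class variance (Otsu)."""
    hist, edges = np.histogram(image.ravel(), bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2.0
    weights = hist.astype(float) / hist.sum()

    w0 = np.cumsum(weights)                  # probability of the darker class
    w1 = 1.0 - w0                            # probability of the brighter class
    cum_mean = np.cumsum(weights * centers)  # cumulative first moment
    total_mean = cum_mean[-1]

    # Between-class variance; bins where one class is empty are left at zero.
    valid = (w0 > 0) & (w1 > 0)
    between = np.zeros_like(w0)
    between[valid] = (total_mean * w0[valid] - cum_mean[valid]) ** 2 / (
        w0[valid] * w1[valid]
    )
    return float(centers[np.argmax(between)])

# Hypothetical usage on a synthetic two-level image.
img = np.concatenate([np.full(100, 50.0), np.full(100, 200.0)]).reshape(10, 20)
lesion_mask = img > otsu_threshold(img)
```

Whether the lesion is the brighter or the darker class depends on the imaging modality, so the direction of the final comparison is also an assumption.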
Procedia PDF Downloads 270
973 Embedded Semantic Segmentation Network Optimized for Matrix Multiplication Accelerator
Authors: Jaeyoung Lee
Abstract:
Autonomous driving systems require high reliability to provide people with a safe and comfortable driving experience. However, despite the development of a number of vehicle sensors, it is difficult to always provide high perception performance in driving environments that vary with time and season. Image segmentation using deep learning, which has recently evolved rapidly, provides stable, high recognition performance in various road environments. However, since the system controls a vehicle in real time, a highly complex deep learning network cannot be used due to time and memory constraints. Moreover, efficient networks are optimized for GPU environments, so their performance degrades in embedded processor environments equipped with simple hardware accelerators. In this paper, a semantic segmentation network, matrix multiplication accelerator network (MMANet), optimized for the matrix multiplication accelerator (MMA) on Texas Instruments digital signal processors (TI DSP), is proposed to improve the recognition performance of autonomous driving systems. The proposed method is designed to maximize the number of layers that can be executed in a limited time to provide reliable driving environment information in real time. First, the number of channels in the activation map is fixed to fit the structure of the MMA. By increasing the number of parallel branches, the lack of information caused by fixing the number of channels is resolved. Second, an efficient convolution is selected depending on the size of the activation. Since the MMA is fixed, normal convolution may be more efficient than depthwise separable convolution, depending on memory access overhead. Thus, the convolution type is decided according to output stride to increase network depth. In addition, memory access time is minimized by processing operations only in L3 cache. Lastly, reliable contexts are extracted using the extended atrous spatial pyramid pooling (ASPP). 
The suggested method gets stable features from an extended path by increasing the kernel size and accessing consecutive data. In addition, it consists of two ASPPs to obtain high-quality contexts using the restored shape without global average pooling paths, since the layer uses MMA as a simple adder. To verify the proposed method, an experiment is conducted using perfsim, a timing simulator, and the Cityscapes validation sets. The proposed network can process an image with 640 x 480 resolution in 6.67 ms, so six cameras can be used to identify the surroundings of the vehicle at 20 frames per second (FPS). In addition, it achieves 73.1% mean intersection over union (mIoU), which is the highest recognition rate among embedded networks on the Cityscapes validation set.
Keywords: edge network, embedded network, MMA, matrix multiplication accelerator, semantic segmentation network
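The timing claim above can be checked with simple arithmetic. The sketch below uses only the figures quoted in the abstract (6.67 ms per image, six cameras, 20 FPS) and assumes, hypothetically, that the six camera images are processed one after another on a single accelerator; the function names are illustrative.

```python
PER_IMAGE_MS = 6.67   # processing time per 640 x 480 image, from the abstract
NUM_CAMERAS = 6       # cameras covering the vehicle surroundings
TARGET_FPS = 20       # frame rate quoted in the abstract

def cycle_time_ms(per_image_ms: float, num_cameras: int) -> float:
    """Time to process one frame from every camera, assuming sequential processing."""
    return per_image_ms * num_cameras

def fits_frame_budget(per_image_ms: float, num_cameras: int, fps: float) -> bool:
    """True if one full camera cycle fits inside a single frame period."""
    frame_period_ms = 1000.0 / fps
    return cycle_time_ms(per_image_ms, num_cameras) <= frame_period_ms

cycle = cycle_time_ms(PER_IMAGE_MS, NUM_CAMERAS)          # about 40 ms
ok = fits_frame_budget(PER_IMAGE_MS, NUM_CAMERAS, TARGET_FPS)
```

Under this assumption one cycle takes about 40 ms, which is within the 50 ms frame period of a 20 FPS system and leaves roughly 10 ms of headroom per frame.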
Procedia PDF Downloads 129
972 „Real and Symbolic in Poetics of Multiplied Screens and Images“
Authors: Kristina Horvat Blazinovic
Abstract:
In the context of a work of art, one can talk about the idea-concept-term-intention expressed by the artist by using various forms of repetition (external, material, visible repetition). Such repetitions of elements (images in space or moving visual and sound images in time) suggest a "covert", "latent" ("dressed") repetition – i.e., "hidden", "latent" term-intention-idea. Repeating in this way reveals a "deeper truth" that the viewer needs to decode and which is hidden "under" the technical manifestation of the multiplied images. It is not only images, sounds, and screens that are repeated - something else is repeated through them as well, even if, in some cases, the very idea of repetition is repeated. This paper examines serial images and single-channel or multi-channel artwork in the field of video/film art and video installations, which in a way implies the concept of repetition and multiplication. Moving or static images and screens (as multi-screens) are repeated in time and space. The categories of the real and the symbolic partly refer to the Lacan registers of reality, i.e., the Imaginary - Symbolic – Real trinity that represents the orders within which human subjectivity is established. Authors such as Bruce Nauman, VALIE EXPORT, Ragnar Kjartansson, Wolf Vostell, Shirin Neshat, Paul Sharits, Harun Farocki, Dalibor Martinis, Andy Warhol, Douglas Gordon, Bill Viola, Frank Gillette, and Ira Schneider, and Marina Abramovic problematize, in different ways, the concept and procedures of multiplication - repetition, but not in the sense of "copying" and "repetition" of reality or the original, but of repeated repetitions of the simulacrum. Referential works of art are often connected by the theme of the traumatic. Repetitions of images and situations are a response to the traumatic (experience) - repetition itself is a symptom of trauma. On the other hand, repeating and multiplying traumatic images results in a new traumatic effect or cancels it. 
Reflections on repetition as a temporal and spatial phenomenon are in line with the chapters that link philosophical considerations of space and time and experience temporality with their manifestation in works of art. The observations about time and the relation of perception and memory follow Henri Bergson and his conception of duration (durée) as "quality of quantity." The video works intended to be displayed as a video loop express the idea of infinite duration ("pure time," according to Bergson). The Loop wants to be always present - to fixate in time. Wholeness is unrecognizable because the intention is to make the effect infinitely cyclic. Reflections on time and space end with considerations about the occurrence and effects of time and space intervals as places and moments "between" – the points of connection and separation, of continuity and stopping - by reference to the "interval theory" of Soviet filmmaker Dziga Vertov. The scale of opportunities that can be explored in interval mode is wide. Intervals represent the perception of time and space in the form of pauses, interruptions, and breaks (e.g., emotional, dramatic, or rhythmic) and denote emptiness or silence, distance, proximity, interstitial space, or a gap between various states.
Keywords: video installation, performance, repetition, multi-screen, real and symbolic, loop, video art, interval, video time
Procedia PDF Downloads 173
971 Detecting and Disabling Digital Cameras Using D3CIP Algorithm Based on Image Processing
Authors: S. Vignesh, K. S. Rangasamy
Abstract:
The paper deals with a device capable of detecting and disabling digital cameras. The system locates the camera and then neutralizes it. Every digital camera has an image sensor known as a CCD, which is retro-reflective and sends light back directly to its original source at the same angle. The device shines infrared LED light, which is invisible to the human eye, at a distance of about 20 feet. It then collects video of these reflections with a camcorder. The video of the reflections is then transferred to a computer connected to the device, where it is sent through image processing algorithms that pick out the infrared light bouncing back. Once the camera is detected, the device projects an invisible infrared laser into the camera's lens, thereby overexposing the photo and rendering it useless. Low levels of infrared laser light neutralize digital cameras but pose neither a health danger to humans nor a risk of physical damage to cameras. We also discuss a simplified design of the device that can be used in theatres to prevent piracy. The domains covered here are optics and image processing.
Keywords: CCD, optics, image processing, D3CIP
Procedia PDF Downloads 357
970 Educational Video Capsules for Fostering Teachers Creativity
Authors: Martha Salinas, Valkyria Bernal
Abstract:
Creativity is a possible response to the profound social, economic, and global changes society is living through, and education is the means of developing this kind of capacity. However, institutional pressures often prevent teachers from engaging in creative teaching practices and keep innovation from being the main curricular focus when building learning scenarios and experiences. This study proposes and validates the use of a prototype of Educational Video Capsules from the perspective of teacher training, presenting the different stages of design, the content plan, as well as the influences of its components and characteristics from the perspective of creativity. The paper presents literature findings on the factors that influence the innovative behavior of teachers, the beliefs of teachers about creativity and its nature, as well as the creative pedagogies that have generated better results. The results show that the disposition of teachers towards creative pedagogies improves significantly with the use of a tool that is based on the principles of microlearning and is developed in a non-academic, autonomous, and non-imposed family environment, unlike the settings in which traditional teacher training processes usually occur.
Keywords: educational innovation, resistance to innovation, creativity, creative pedagogy
Procedia PDF Downloads 157
969 Fruit Identification System in Sweet Orange Citrus (L.) Osbeck Using Thermal Imaging and Fuzzy
Authors: Ingrid Argote, John Archila, Marcelo Becker
Abstract:
In agriculture, intelligent systems applications have generated great advances in automating some of the processes in the production chain. To improve the efficiency of those systems, a vision system is proposed to estimate the number of fruits in sweet orange trees. This work presents a system proposal using thermal image capture and fuzzy logic. A bibliographical review was carried out to analyze the state of the art of the different systems used in fruit recognition, as well as the different applications of thermography in agricultural systems. The algorithm developed for this project uses the metrics of the fuzziness parameter for contrast improvement and segmentation of the image; the counting algorithm uses the Hough transform. To validate the proposed algorithm, a bank of images of sweet orange Citrus (L.) Osbeck was acquired at the Maringá Farm. The tests with the algorithm indicated that the temperature difference between the tree branches and the fruit is not very high, which makes image segmentation based on this difference difficult and increases the number of false positives in the fruit counting algorithm. Recognition of isolated fruits with the proposed algorithm presents an overall accuracy of 90.5%; for grouped fruits, the accuracy was 81.3%. The experiments show the need for more suitable hardware to better recognize small temperature changes in the image.
Keywords: Agricultural systems, Citrus, Fuzzy logic, Thermal images.
Procedia PDF Downloads 229
968 Analysis of Q-Learning on Artificial Neural Networks for Robot Control Using Live Video Feed
Authors: Nihal Murali, Kunal Gupta, Surekha Bhanot
Abstract:
Training of artificial neural networks (ANNs) using reinforcement learning (RL) techniques is being widely discussed in the robot learning literature. The high model complexity of ANNs along with the model-free nature of RL algorithms provides a desirable combination for many robotics applications. There is a huge need for algorithms that generalize using raw sensory inputs, such as vision, without any hand-engineered features or domain heuristics. In this paper, the standard control problem of a line-following robot was used as a test-bed, and an ANN controller for the robot was trained on images from a live video feed using Q-learning. A virtual agent was first trained in a simulation environment and then deployed onto a robot’s hardware. The robot successfully learns to traverse a wide range of curves and displays excellent generalization ability. Qualitative analysis of the evolution of policies, performance and weights of the network provides insights into the nature and convergence of the learning algorithm.
Keywords: artificial neural networks, q-learning, reinforcement learning, robot learning
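For readers unfamiliar with Q-learning, the core update can be sketched in tabular form. This is a simplification: the paper trains an ANN on camera images, whereas the table, the state and action indices, and the `alpha` and `gamma` values below are all illustrative assumptions.

```python
import numpy as np

def q_update(q: np.ndarray, state: int, action: int, reward: float,
             next_state: int, alpha: float = 0.1, gamma: float = 0.9) -> np.ndarray:
    """Apply one Q-learning step: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    td_target = reward + gamma * np.max(q[next_state])
    q[state, action] += alpha * (td_target - q[state, action])
    return q

# Hypothetical toy problem: 2 states (e.g. "on line", "off line") and 2 actions.
q_table = np.zeros((2, 2))
q_table = q_update(q_table, state=0, action=1, reward=1.0, next_state=1)
```

With an ANN controller, the table lookup is replaced by a network that maps an image to one Q-value per action, and the same temporal-difference target is used as the regression target.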
Procedia PDF Downloads 372
967 Alphabet Recognition Using Pixel Probability Distribution
Authors: Vaidehi Murarka, Sneha Mehta, Dishant Upadhyay
Abstract:
Our project topic is “Alphabet Recognition using pixel probability distribution”. The project uses techniques of Image Processing and Machine Learning in Computer Vision. Alphabet recognition is the mechanical or electronic translation of scanned images of handwritten, typewritten or printed text into machine-encoded text. It is widely used to convert books and documents into electronic files. Alphabet recognition based OCR applications are sometimes used in signature recognition, which is used in banks and other high-security buildings. One popular mobile application reads a visiting card and stores it directly in the contacts. OCR systems are also known to be used in radar systems for reading speeders' license plates, among many other uses. Our project was implemented using Visual Studio and OpenCV (Open Source Computer Vision). Our algorithm is based on Neural Networks (machine learning). The project was implemented in three modules: (1) Training: This module aims at database generation. The database was generated using two methods: (a) Run-time generation: the database is generated at compilation time using the inbuilt fonts of the OpenCV library. Human intervention is not necessary for generating this database. (b) Contour detection: a ‘jpeg’ template containing different fonts of an alphabet is converted to the weighted matrix using specialized functions (contour detection and blob detection) of OpenCV. The main advantage of this type of database generation is that the algorithm becomes self-learning and the final database requires little memory to be stored (119 kB, to be precise). (2) Preprocessing: The input image is pre-processed using image processing concepts such as adaptive thresholding, binarizing and dilating, and is made ready for segmentation. “Segmentation” includes extraction of lines, words, and letters from the processed text image. 
(3) Testing and prediction: The extracted letters are classified and predicted using the neural networks algorithm. The algorithm recognizes an alphabet based on certain mathematical parameters calculated using the database and weight matrix of the segmented image.
Keywords: contour-detection, neural networks, pre-processing, recognition coefficient, runtime-template generation, segmentation, weight matrix
Procedia PDF Downloads 389
966 Detecting Tomato Flowers in Greenhouses Using Computer Vision
Authors: Dor Oppenheim, Yael Edan, Guy Shani
Abstract:
This paper presents an image analysis algorithm to detect and count yellow tomato flowers in a greenhouse with uneven illumination conditions, complex growth conditions and different flower sizes. The algorithm is designed to be employed on a drone that flies in greenhouses to accomplish several tasks such as pollination and yield estimation. Detecting the flowers can provide useful information for the farmer, such as the number of flowers in a row, and the number of flowers that were pollinated since the last visit to the row. The developed algorithm is designed to handle the real-world difficulties in a greenhouse, which include varying lighting conditions, shadowing, and occlusion, while considering the computational limitations of the simple processor in the drone. The algorithm identifies flowers using an adaptive global threshold, segmentation over the HSV color space, and morphological cues. The adaptive threshold divides the images into darker and lighter images. Then, segmentation on the hue, saturation and value is performed accordingly, and classification is done according to size and location of the flowers. 1069 images of greenhouse tomato flowers were acquired in a commercial greenhouse in Israel, using two different RGB cameras – an LG G4 smartphone and a Canon PowerShot A590. The images were acquired from multiple angles and distances and were sampled manually at various periods along the day to obtain varying lighting conditions. Ground truth was created by manually tagging approximately 25,000 individual flowers in the images. Sensitivity analyses on the acquisition angle of the images, periods throughout the day, different cameras and thresholding types were performed. Precision, recall and their derived F1 score were calculated. Results indicate better performance for the view angle facing the flowers than any other angle. Acquiring images in the afternoon resulted in the best precision and recall results. 
Applying a global adaptive threshold improved the median F1 score by 3%. Results showed no difference between the two cameras used. Using hue values of 0.12-0.18 in the segmentation process provided the best results in precision and recall, and the best F1 score. The precision and recall average for all the images when using these values was 74% and 75% respectively with an F1 score of 0.73. Further analysis showed a 5% increase in precision and recall when analyzing images acquired in the afternoon and from the front viewpoint.
Keywords: agricultural engineering, image processing, computer vision, flower detection
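The hue band reported above (0.12 to 0.18 on a 0 to 1 scale) can be illustrated with a minimal sketch. It covers only the hue test; the saturation, value, adaptive-threshold and morphological steps described in the abstract are omitted, and the function name `hue_mask` and the nested-list image layout are assumptions for illustration.

```python
import colorsys

def hue_mask(pixels, hue_min=0.12, hue_max=0.18):
    """Mark pixels whose hue lies in [hue_min, hue_max].

    pixels: rows of (r, g, b) tuples with components in 0..255.
    Hue is on the 0..1 scale returned by colorsys.rgb_to_hsv.
    """
    mask = []
    for row in pixels:
        mask_row = []
        for r, g, b in row:
            hue, _sat, _val = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            mask_row.append(hue_min <= hue <= hue_max)
        mask.append(mask_row)
    return mask

# Hypothetical 1 x 3 image: pure yellow, pure green, pure red.
tiny_image = [[(255, 255, 0), (0, 255, 0), (255, 0, 0)]]
flower_candidates = hue_mask(tiny_image)
```

Pure yellow has a hue of 1/6 (about 0.167), which falls inside the reported band, while pure green and pure red fall outside it. A real pipeline would vectorise this over whole images rather than loop over pixels.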
Procedia PDF Downloads 329
965 Non-Invasive Data Extraction from Machine Display Units Using Video Analytics
Authors: Ravneet Kaur, Joydeep Acharya, Sudhanshu Gaur
Abstract:
Artificial Intelligence (AI) has the potential to transform manufacturing by improving shop floor processes such as production, maintenance and quality. However, industrial datasets are notoriously difficult to extract in a real-time, streaming fashion, thus negating potential AI benefits. The main example is specialized industrial controllers that are operated by custom software, which complicates the process of connecting them to an Information Technology (IT) based data acquisition network. Security concerns may also limit direct physical access to these controllers for data acquisition. To connect the Operational Technology (OT) data stored in these controllers to an AI application in a secure, reliable and available way, we propose a novel Industrial IoT (IIoT) solution in this paper. In this solution, we demonstrate how video cameras can be installed on a factory shop floor to continuously obtain images of the controller human machine interfaces (HMIs). We propose image pre-processing to segment the HMI into regions of streaming data and regions of fixed meta-data. We then evaluate the performance of multiple Optical Character Recognition (OCR) technologies, such as Tesseract and Google Vision, in recognizing the streaming data, and test them for typical factory HMIs and realistic lighting conditions. Finally, we use the meta-data to match the OCR output with the temporal, domain-dependent context of the data to improve the accuracy of the output. Our IIoT solution enables reliable and efficient data extraction, which will improve the performance of subsequent AI applications.
Keywords: human machine interface, industrial internet of things, internet of things, optical character recognition, video analytics
Procedia PDF Downloads 109
964 Video-On-Demand QoE Evaluation across Different Age-Groups and Its Significance for Network Capacity
Authors: Mujtaba Roshan, John A. Schormans
Abstract:
Quality of Experience (QoE) drives churn in the broadband networks industry, and good QoE plays a large part in the retention of customers. QoE is known to be affected by the Quality of Service (QoS) factors packet loss probability (PLP), delay and delay jitter caused by the network. Earlier results have shown that the relationship between these QoS factors and QoE is non-linear, and may vary from application to application. We use the network emulator Netem as the basis for experimentation, and evaluate how QoE varies as we change the emulated QoS metrics. Focusing on Video-on-Demand, we discovered that the reported QoE may differ widely for users of different age groups, and that the most demanding age group (the youngest) can require an order of magnitude lower PLP to achieve the same QoE than is required by the most widely studied age group of users. We then used a bottleneck TCP model to evaluate the capacity cost of achieving an order of magnitude decrease in PLP, and found it to be (almost always) a 3-fold increase in link capacity.
Keywords: network capacity, packet loss probability, quality of experience, quality of service
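The abstract does not say which bottleneck TCP model was used. As a hedged illustration only, the sketch below assumes the widely cited inverse-square-root approximation, in which steady-state TCP throughput scales as MSS / (RTT * sqrt(p)) for loss probability p. Under that assumption, a tenfold reduction in PLP corresponds to a sqrt(10), or roughly 3.2-fold, increase in throughput, which is in line with the roughly 3-fold capacity figure reported. The function names and the constant are assumptions, not taken from the paper.

```python
import math

def tcp_throughput_bps(mss_bytes: float, rtt_s: float, plp: float,
                       c: float = math.sqrt(1.5)) -> float:
    """Approximate steady-state TCP throughput (bits/s), assuming rate ~ C*MSS/(RTT*sqrt(p))."""
    return c * mss_bytes * 8.0 / (rtt_s * math.sqrt(plp))

def capacity_ratio(plp_before: float, plp_after: float) -> float:
    """Throughput ratio implied by a change in loss probability under the same model."""
    return math.sqrt(plp_before / plp_after)

# An order-of-magnitude drop in PLP, e.g. from 1% to 0.1% (illustrative values).
ratio = capacity_ratio(1e-2, 1e-3)
```

The ratio depends only on the two loss probabilities, so the segment size and round-trip time cancel out in this simplified model.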
Procedia PDF Downloads 273
963 A Scalable Media Job Framework for an Open Source Search Engine
Authors: Pooja Mishra, Chris Pollett
Abstract:
This paper explores efficient ways to implement various media-updating features like news aggregation, video conversion, and bulk email handling. All of these jobs share the property that they are periodic in nature, and they all benefit from being handled in a distributed fashion. The data for these jobs also often comes from a social or collaborative source. We isolate the class of periodic, one-round map reduce jobs as a useful setting to describe and handle media-updating tasks. As such tasks are simpler than general map reduce jobs, programming them in a general map reduce platform could easily become tedious. This paper presents a MediaUpdater module of the Yioop Open Source Search Engine Web Portal designed to handle such jobs via an extension of a PHP class. We describe how to implement various media-updating tasks in our system as well as experiments carried out using these implementations on an Amazon Web Services cluster.
Keywords: distributed jobs framework, news aggregation, video conversion, email
Procedia PDF Downloads 298
962 Serious Video Games as Literacy and Vocabulary Acquisition Environments for Greek as Second/Foreign Language: The Case of “Einstown”
Authors: Christodoulakis Georgios, Kiourti Elisavet
Abstract:
The Covid-19 pandemic has affected millions of people on a global scale, while lockdowns and quarantine measures were adopted periodically by a vast number of countries. These peculiar socio-historical conditions have led to the growth of participation in online environments. At the same time, the official educational bodies of many countries have been forced, for the first time at least for Greece and Cyprus, to switch to distance learning methods throughout the educational levels. However, this has not been done without issues, at both the technological and functional levels, concerning the tools and the processes. Video games are the finest example of simulations of distance learning problem-solving environments. They incorporate different semiotic modes (e.g., a combination of image, sound, texts, gesture) while all this takes place in socially and culturally constructed contexts. Players interact in the game environment in terms of spaces, objects, and actions in order to accomplish their goals, solve its problems, and win the game. In addition, players engage in layering literacies, which include combinations of independent and collaborative, digital and nondigital practices and spaces acting jointly to support meaning making, including interaction among and across texts and modalities (Abrams, 2017). From this point of view, players are engaged in collaborative, self-directed, and interest-based experiences by going back and forth and around gameplay. Within this context, this paper investigates the way Einstown, a Greek serious video game, functions as an effective distance learning environment for teaching Greek as a second/foreign language to adults. The research methodology adopted is the case study approach using mixed methods. The participants were two adult women who are immigrants in Greece and who had zero gaming experience. 
The results of this research reveal that the videogame Einstown is, in fact, a digital environment of literacy through which the participants achieve active learning and cooperation and engage in digital and non-digital literacy practices that improve the learning of specialized vocabulary presented throughout the gameplay.
Keywords: second/foreign language, vocabulary acquisition, literacy, serious video games
Procedia PDF Downloads 154
961 Tool for Maxillary Sinus Quantification in Computed Tomography Exams
Authors: Guilherme Giacomini, Ana Luiza Menegatti Pavan, Allan Felipe Fattori Alves, Marcela de Oliveira, Fernando Antonio Bacchim Neto, José Ricardo de Arruda Miranda, Seizo Yamashita, Diana Rodrigues de Pina
Abstract:
The maxillary sinus (MS), part of the paranasal sinus complex, is one of the most enigmatic structures in modern humans. The literature has suggested that MSs function as olfaction accessories, to heat or humidify inspired air, for thermoregulation, and to impart resonance to the voice, among other roles. Thus, the real function of the MS is still uncertain. Furthermore, the MS anatomy is complex and varies from person to person. Many diseases may affect the development process of sinuses. The incidence of rhinosinusitis and other pathoses in the MS is comparatively high, so volume analysis has clinical value. Providing volume values for MS could be helpful in evaluating the presence of any abnormality and could be used for treatment planning and evaluation of the outcome. Computed tomography (CT) has allowed a more exact assessment of this structure, which enables a quantitative analysis. However, this is not always possible in the clinical routine, and if possible, it involves much effort and/or time. Therefore, it is necessary to have a convenient, robust, and practical tool correlated with the MS volume, allowing clinical applicability. Nowadays, the available methods for MS segmentation are manual or semi-automatic. Additionally, manual methods present inter and intraindividual variability. Thus, the aim of this study was to develop an automatic tool to quantify the MS volume in CT scans of paranasal sinuses. This study was developed with ethical approval from the authors’ institutions and national review panels. The research involved 30 retrospective exams of University Hospital, Botucatu Medical School, São Paulo State University, Brazil. The tool for automatic MS quantification, developed in Matlab®, uses a hybrid method, combining different image processing techniques. For MS detection, the algorithm uses a Support Vector Machine (SVM) with features such as pixel value, spatial distribution, shape and others. 
The detected pixels are used as seed point for a region growing (RG) segmentation. Then, morphological operators are applied to reduce false-positive pixels, improving the segmentation accuracy. These steps are applied in all slices of CT exam, obtaining the MS volume. To evaluate the accuracy of the developed tool, the automatic method was compared with manual segmentation realized by an experienced radiologist. For comparison, we used Bland-Altman statistics, linear regression, and Jaccard similarity coefficient. From the statistical analyses for the comparison between both methods, the linear regression showed a strong association and low dispersion between variables. The Bland–Altman analyses showed no significant differences between the analyzed methods. The Jaccard similarity coefficient was > 0.90 in all exams. In conclusion, the developed tool to quantify MS volume proved to be robust, fast, and efficient, when compared with manual segmentation. Furthermore, it avoids the intra and inter-observer variations caused by manual and semi-automatic methods. As future work, the tool will be applied in clinical practice. Thus, it may be useful in the diagnosis and treatment determination of MS diseases.
Keywords: maxillary sinus, support vector machine, region growing, volume quantification
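Two of the steps named in the abstract, seeded region growing and the Jaccard similarity coefficient, can be sketched on a single 2D slice. This is a minimal illustration, not the authors' Matlab tool: the SVM seed detection and the morphological clean-up are omitted, and the intensity `tolerance` criterion, the 4-neighbour connectivity and the function names are assumptions.

```python
from collections import deque

def region_grow(image, seed, tolerance):
    """Grow a 4-connected region from seed, accepting pixels within tolerance of the seed value."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region:
                if abs(image[nr][nc] - seed_value) <= tolerance:
                    region.add((nr, nc))
                    queue.append((nr, nc))
    return region

def jaccard(a, b):
    """Jaccard similarity of two pixel sets: |A intersect B| / |A union B|."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

# Hypothetical 3 x 3 slice: low values stand for air-filled sinus, 9 for bone.
slice_2d = [[0, 0, 9],
            [0, 9, 9],
            [0, 0, 0]]
automatic = region_grow(slice_2d, seed=(0, 0), tolerance=1)
manual = {(0, 0), (0, 1), (1, 0), (2, 0), (2, 1), (2, 2)}
overlap = jaccard(automatic, manual)
```

Summing the segmented pixel counts over all slices and multiplying by the voxel volume would give the sinus volume; the voxel size comes from the CT metadata and is not modelled here.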
Procedia PDF Downloads 504
960 Interactive Shadow Play Animation System
Authors: Bo Wan, Xiu Wen, Lingling An, Xiaoling Ding
Abstract:
The paper describes a Chinese shadow play animation system based on Kinect. Users without any professional training can manipulate the shadow characters with their body actions to complete a shadow play performance and, if they wish, obtain a video of the performance by giving the system a record command. In our system, Kinect is responsible for capturing human movement and voice command data. A gesture recognition module is used to control the change of shadow play scenes. After packaging the data from Kinect and the recognition result from the gesture recognition module, VRPN transmits them to the server side. Finally, the server side uses the information to control the motion of the shadow characters and the video recording. This system not only achieves human-computer interaction but also realizes interaction between people. It brings an entertaining experience to users and is easy to operate for all ages. Even more importantly, the application background of Chinese shadow play contributes to the protection of the art of shadow play animation.
Keywords: shadow play animation, Kinect, gesture recognition, VRPN, HCI
Procedia PDF Downloads 401
959 Portable Glove Controlled Video Game for Hand Rehabilitation
Authors: Vinesh Janarthanan, Mohammad H. Rahman
Abstract:
Numerous neurological conditions may result in a loss of motor function, including cerebral palsy, Parkinson’s disease, stroke, and multiple sclerosis. With impaired motor function, specifically in the hand and arm, living independently becomes considerably more difficult. Rehabilitation programs are the main method of treating individuals with these impairments. However, these programs require a long-term commitment from clinicians and therapists, demand person-to-person care, and typically have a very long treatment duration. Aside from the treatment received from the therapist, the continuation of neuroplasticity at home is essential to maximizing development and restoring biological function. To contribute to this area, we have researched and developed a portable and comfortable hand glove for fine motor skills rehabilitation. The glove provides interactive home-based therapy that engages the patient with simple games. The key to this treatment is repeated hand movement and the ability to position the hand in various ways.
Keywords: home based, wearable sensors, glove, rehabilitation, motor function, video games
Procedia PDF Downloads 147
958 Roadway Infrastructure and Bus Safety
Authors: Richard J. Hanowski, Rebecca L. Hammond
Abstract:
Very few studies have been conducted to investigate safety issues associated with motorcoach/bus operations. The current study investigates the impact that roadway infrastructure, including locality, roadway grade, traffic flow, and traffic density, has on bus safety. A naturalistic driving study involving 43 motorcoaches was conducted in the U.S.A. Two fleets participated in the study, and over 600,000 miles of naturalistic driving data were collected. Sixty-five bus drivers participated in this study: 48 male and 17 female. The average age of the drivers was 49 years. A sophisticated data acquisition system (DAS) was installed on each of the 43 motorcoaches, and a variety of kinematic and video data were continuously recorded. The data were analyzed by identifying safety critical events (SCEs), which included crashes, near-crashes, crash-relevant conflicts, and unintentional lane deviations. Baseline (normative driving) segments were also identified and analyzed for comparison to the SCEs. This presentation highlights the need for bus safety research and the methods used in this data collection effort. With respect to elements of roadway infrastructure, this study highlights the methods used to assess locality, roadway grade, traffic flow, and traffic density. Locality was determined by manual review of the recorded video for each event and baseline and was characterized in terms of open country, residential, business/industrial, church, playground, school, urban, airport, interstate, and other. Roadway grade was similarly determined through video review and characterized in terms of level, grade up, grade down, hillcrest, and dip. The video was also used to determine the traffic flow and traffic density at the time of the event or baseline segment. 
For traffic flow, video was used to assess which of the following best characterized the event or baseline: not divided (2-way traffic), not divided (center 2-way left turn lane), divided (median or barrier), one-way traffic, or no lanes. In terms of traffic density, level-of-service categories were used: A1, A2, B, C, D, E, and F. Highlighted in this abstract are only a few of the many roadway elements that were coded in this study. Other elements included lighting levels, weather conditions, roadway surface conditions, relation to junction, and roadway alignment. Note that a key component of this study was to assess the impact that driver distraction and fatigue have on bus operations. In this regard, once the roadway elements had been coded, the primary research questions addressed were (i) “What environmental conditions are associated with driver choice of engagement in tasks?” and (ii) “What are the odds of being in an SCE while engaging in tasks while encountering these conditions?”. The study may be of interest to researchers and traffic engineers who are interested in the relationship between roadway infrastructure elements and safety events in motorcoach bus operations.
Keywords: bus safety, motorcoach, naturalistic driving, roadway infrastructure
Procedia PDF Downloads 180
957 Current Applications of Artificial Intelligence (AI) in Chest Radiology
Authors: Angelis P. Barlampas
Abstract:
Learning Objectives: The purpose of this study is to briefly inform the reader about the applications of AI in chest radiology. Background: Currently, there are 190 FDA-approved radiology AI applications, with 42 (22%) pertaining specifically to thoracic radiology. Imaging Findings or Procedure Details: Aids of AI in chest radiology: Detects and segments pulmonary nodules. Subtracts bone to provide an unobstructed view of the underlying lung parenchyma and provides further information on nodule characteristics, such as nodule location, nodule two-dimensional size or three-dimensional (3D) volume, change in nodule size over time, attenuation data (i.e., mean, minimum, and/or maximum Hounsfield units [HU]), morphological assessments, or combinations of the above. Reclassifies indeterminate pulmonary nodules into low or high risk with higher accuracy than conventional risk models. Detects pleural effusion. Differentiates tension pneumothorax from nontension pneumothorax. Detects cardiomegaly, calcification, consolidation, mediastinal widening, atelectasis, fibrosis, and pneumoperitoneum. Automatically localises vertebral segments, labels ribs, and detects rib fractures. Measures the distance from the tube tip to the carina and localizes both endotracheal tubes and central vascular lines. Detects consolidation and progression of parenchymal diseases such as pulmonary fibrosis or chronic obstructive pulmonary disease (COPD). Can evaluate lobar volumes. Identifies and labels pulmonary bronchi and vasculature and quantifies air-trapping. Offers emphysema evaluation. Provides functional respiratory imaging, whereby high-resolution CT images are post-processed to quantify airflow by lung region and may be used to quantify key biomarkers such as airway resistance, air-trapping, ventilation mapping, lung and lobar volume, and blood vessel and airway volume. Assesses the lung parenchyma by way of density evaluation. 
Provides percentages of tissues within defined attenuation (HU) ranges, in addition to automated lung segmentation and lung volume information. Improves image quality for noisy images with a built-in denoising function. Detects emphysema, a common condition seen in patients with a history of smoking, and hyperdense or opacified regions, thereby aiding in the diagnosis of certain pathologies, such as COVID-19 pneumonia. Aids in cardiac segmentation and calcium detection, aorta segmentation and diameter measurements, and vertebral body segmentation and density measurements. Conclusion: The future is yet to come, but AI is already a helpful tool for daily practice in radiology. It is expected that the continuing progress of computerized systems and improvements in software algorithms will render AI the second hand of the radiologist.
Keywords: artificial intelligence, chest imaging, nodule detection, automated diagnoses
Procedia PDF Downloads 72
956 The Development of Integrated Real-Life Video and Animation with Addie Based on Constructive for Improving Students’ Mastery Concept in Rotational Dynamics
Authors: Silka Abyadati, Dadi Rusdiana, Enjang Akhmad Juanda
Abstract:
This study investigates the difference in the enhancement of concept mastery between students who study using Integrated Real-Life Video and Animation (IRVA) and students who study without IRVA. IRVA was developed in five stages, Analyze, Design, Development, Implementation, and Evaluation (ADDIE), on a constructivist basis for Rotational Dynamics material in Physics learning. The constructivist learning model used is Interpretation Construction (ICON), which has the following phases: 1) Observation, 2) Interpretation construction, 3) Contextualization of prior knowledge, 4) Cognitive conflict, 5) Cognitive learning, 6) Collaboration, 7) Multiple interpretations, 8) Multiple manifestations. IRVA was developed for the stages of observation, cognitive conflict, and cognitive learning. The sample consisted of an experimental group of 32 students and a control group of 32 students in class XI in the 2015/2016 school year at a senior high school in Bandung. The study was conducted by giving a pretest and a posttest in the form of 20 multiple-choice items to determine the enhancement of concept mastery in Rotational Dynamics. Hypothesis testing was done using a t-test on the average N-gain of concept mastery. The results showed a significant difference in the enhancement of students’ concept mastery between students who studied using IRVA and students who studied without IRVA. Students in the experimental group increased by 0.468, while students in the control group increased by 0.207.
Keywords: ADDIE, constructivist learning, Integrated Real-Life Video and Animation, mastery concepts, rotational dynamics
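The abstract reports N-gain values without defining them. A minimal sketch follows, assuming the commonly used normalized gain, (posttest − pretest) / (maximum score − pretest); whether the study used exactly this definition is an assumption, and the function name `n_gain` and the example scores are hypothetical.

```python
def n_gain(pre, post, max_score):
    """Normalized gain: the fraction of the possible improvement that
    was actually achieved between pretest and posttest."""
    if pre >= max_score:
        raise ValueError("pretest score must be below the maximum score")
    return (post - pre) / (max_score - pre)


# Hypothetical student on a 20-item test: 8 correct before, 14 after.
example_gain = n_gain(pre=8, post=14, max_score=20)  # 6 / 12 = 0.5
```

Under this definition, a value such as 0.468 would mean that a group closed a little under half of the gap between its pretest score and the maximum score.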
Procedia PDF Downloads 231
955 Content Analysis of Video Translations: Examining the Linguistic and Thematic Approach by Translator Abdullah Khrief on the X Platform
Authors: Easa Almustanyir
Abstract:
This study investigates the linguistic and thematic approach of translator Abdullah Khrief in the context of video translations on the X platform. The sample comprises 15 videos from Khrief's account, covering diverse content categories like science, religion, social issues, personal experiences, lifestyle, and culture. The analysis focuses on two aspects: language usage and thematic representation. Regarding language, the study examines the prevalence of English while considering the inclusion of French and German content, highlighting Khrief's multilingual versatility and ability to navigate cultural nuances. Thematically, the study explores the diverse range of topics covered, encompassing scientific, religious, social, and personal narratives, underscoring Khrief's broad subject matter expertise and commitment to knowledge dissemination. The study employs a mixed-methods approach, combining quantitative data analysis with qualitative content analysis. Statistical data on video languages, presenter genders, and content categories are analyzed, and a thorough content analysis assesses translation accuracy, cultural appropriateness, and overall quality. Preliminary findings indicate a high level of professionalism and expertise in Khrief's translations. The absence of errors across the diverse range of videos establishes his credibility and trustworthiness. Furthermore, the accurate representation of cultural nuances and sensitive topics highlights Khrief's cultural sensitivity and commitment to preserving intended meanings and emotional resonance.
Keywords: audiovisual translation, linguistic versatility, thematic diversity, cultural sensitivity, content analysis, mixed-methods approach
Procedia PDF Downloads 17
954 Capturing the Stress States in Video Conferences by Photoplethysmographic Pulse Detection
Authors: Jarek Krajewski, David Daxberger
Abstract:
We propose a stress detection method based on an RGB camera using heart rate detection, also known as Photoplethysmography Imaging (PPGI). This technique measures the small changes in skin colour caused by blood perfusion. A stationary lab setting with simulated video conferences was chosen, using constant light conditions and a sampling rate of 30 fps. The ground truth heart rate was measured with a common PPG system. The proposed pulse peak detection is machine learning-based, applying brute force feature extraction to predict heart rate pulses. The statistical analysis showed good agreement (correlation r = .79, p < 0.05) between the reference heart rate system and the proposed method. Based on these findings, the proposed method could provide a reliable, low-cost, and contactless way of measuring HR parameters in daily-life environments.
Keywords: heart rate, PPGI, machine learning, brute force feature extraction
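To make the idea of camera-based heart rate measurement concrete, here is a minimal Python sketch that swaps the authors' machine learning-based peak detection for a much simpler spectral-peak estimate. It assumes that a one-dimensional trace of mean skin colour per frame has already been extracted; the function name `estimate_heart_rate_bpm` and the 0.7 to 4.0 Hz search band are assumptions, and only the 30 fps sampling rate comes from the abstract.

```python
import numpy as np


def estimate_heart_rate_bpm(trace, fps=30.0, lo_hz=0.7, hi_hz=4.0):
    """Estimate heart rate as the strongest spectral peak of a skin-colour
    trace inside a plausible pulse band (assumed 0.7 to 4.0 Hz)."""
    x = np.asarray(trace, dtype=float)
    x = x - x.mean()  # remove the constant skin-tone offset
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fps)
    band = (freqs >= lo_hz) & (freqs <= hi_hz)
    peak_hz = freqs[band][np.argmax(power[band])]
    return 60.0 * peak_hz


# Synthetic 20 s trace at 30 fps with a 1.2 Hz (72 beats per minute) pulse.
t = np.arange(600) / 30.0
synthetic_trace = 120.0 + 0.5 * np.sin(2.0 * np.pi * 1.2 * t)
bpm = estimate_heart_rate_bpm(synthetic_trace, fps=30.0)
```

The frequency resolution of this estimate is the sampling rate divided by the number of samples, so a 20 s window resolves the rate only to 3 beats per minute; real recordings would also need motion and illumination handling that this sketch omits.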
Procedia PDF Downloads 123
953 Thick Data Techniques for Identifying Abnormality in Video Frames for Wireless Capsule Endoscopy
Authors: Jinan Fiaidhi, Sabah Mohammed, Petros Zezos
Abstract:
Capsule endoscopy (CE) is an established noninvasive diagnostic modality for investigating small bowel disease. CE has a pivotal role in assessing patients with suspected bleeding or identifying evidence of active Crohn's disease in the small bowel. However, CE produces lengthy videos with at least eighty thousand frames, at a frame rate of 2 frames per second. Gastroenterologists cannot dedicate 8 to 15 hours to reading the CE video frames to arrive at a diagnosis. This is why analyzing CE videos with modern artificial intelligence techniques has become a necessity. However, machine learning, including deep learning, has failed to report robust results because of the lack of large samples to train its neural nets. In this paper, we describe a thick data approach that learns from a few anchor images. We use sound datasets like KVASIR and CrohnIPI to filter candidate frames that include interesting anomalies in any CE video. We identify candidate frames based on feature extraction that provides representative measures of the anomaly, such as the size of the anomaly and its color contrast with the image background, and later feed these features to a decision tree that can classify the candidate frames as showing a condition such as Crohn's disease. With our thick data approach, the accuracy of detecting Crohn's disease based on the presence of ulcer areas in the candidate frames was 89.9% for KVASIR and 83.3% for CrohnIPI. We are continuing our research to fine-tune our approach by adding more thick data methods to enhance diagnostic accuracy.
Keywords: thick data analytics, capsule endoscopy, Crohn’s disease, siamese neural network, decision tree
Procedia PDF Downloads 156
952 Vehicle Timing Motion Detection Based on Multi-Dimensional Dynamic Detection Network
Authors: Jia Li, Xing Wei, Yuchen Hong, Yang Lu
Abstract:
Detecting vehicle behavior has always been a focus of intelligent transportation, but with the explosive growth in the number of vehicles and the complexity of the road environment, the vehicle behavior videos captured by traditional surveillance can no longer satisfy the needs of vehicle behavior research. The traditional method of manually labeling vehicle behavior is too time-consuming and labor-intensive, while the existing object detection and tracking algorithms have poor practicability and a low behavioral location detection rate. This paper proposes a vehicle behavior detection algorithm based on a dual-stream convolution network and a multi-dimensional video dynamic detection network. In the videos, the straight-line behavior of the vehicle is treated as the default background behavior. Changing lanes, turning, and turning around are set as target behaviors. The purpose of this model is to automatically mark the target behavior of the vehicle in untrimmed videos. First, the target behavior proposals in the long video are extracted through the dual-stream convolution network. The model uses the dual-stream convolutional network to generate a one-dimensional action score waveform and then extracts segments with scores above a given threshold M as preliminary vehicle behavior proposals. Second, the preliminary proposals are pruned and identified using the multi-dimensional video dynamic detection network. Following hierarchical reinforcement learning, the multi-dimensional network includes a Timer module and a Spacer module, where the Timer module mines time information in the video stream and the Spacer module extracts spatial information in the video frame. The Timer and Spacer modules are implemented by Long Short-Term Memory (LSTM) and start from an all-zero hidden state. The Timer module uses the Transformer mechanism to extract timing information from the video stream and extracts features by linear mapping and other methods. 
Finally, the model fuses time information and spatial information and obtains the location and category of the behavior through the softmax layer. This paper uses recall and precision to measure the performance of the model. Extensive experiments show that, on the dataset of this paper, the proposed model has clear advantages over the existing state-of-the-art behavior detection algorithms. When the Time Intersection over Union (TIoU) threshold is 0.5, the Average-Precision (MP) reaches 36.3% (the MP of baselines is 21.5%). In summary, this paper proposes a vehicle behavior detection model based on a multi-dimensional dynamic detection network. This paper introduces spatial information and temporal information to extract vehicle behaviors in long videos. Experiments show that the proposed algorithm is advanced and accurate in vehicle timing behavior detection. In the future, the focus will be on simultaneously detecting the timing behavior of multiple vehicles in complex traffic scenes (such as a busy street) while ensuring accuracy.
Keywords: vehicle behavior detection, convolutional neural network, long short-term memory, deep learning
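Two of the steps described in this abstract, thresholding the one-dimensional action score waveform into proposals and scoring a proposal by temporal IoU, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' implementation: the function names, the frame-index representation with an exclusive end, and the example scores are all hypothetical, and the networks that produce the scores are not modelled.

```python
def extract_proposals(scores, threshold):
    """Return (start, end) frame-index pairs, end exclusive, for each
    maximal run of consecutive scores strictly above `threshold`."""
    proposals = []
    start = None
    for index, score in enumerate(scores):
        if score > threshold and start is None:
            start = index  # a new run begins
        elif score <= threshold and start is not None:
            proposals.append((start, index))  # the run just ended
            start = None
    if start is not None:
        proposals.append((start, len(scores)))  # run reaches the end
    return proposals


def temporal_iou(segment_a, segment_b):
    """Temporal intersection over union of two (start, end) segments."""
    intersection = max(0, min(segment_a[1], segment_b[1])
                       - max(segment_a[0], segment_b[0]))
    union = ((segment_a[1] - segment_a[0])
             + (segment_b[1] - segment_b[0]) - intersection)
    return intersection / union if union > 0 else 0.0


# Hypothetical per-frame action scores and threshold M = 0.5.
scores = [0.1, 0.8, 0.9, 0.2, 0.7, 0.7, 0.6, 0.1]
proposals = extract_proposals(scores, threshold=0.5)

# Compare the second proposal with a hypothetical ground-truth segment.
overlap = temporal_iou(proposals[1], (5, 9))
```

With a TIoU threshold of 0.5, as used in the reported evaluation, a proposal would count as a correct detection only if its overlap with a ground-truth segment of the same category reached at least 0.5.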
Procedia PDF Downloads 130
951 Perceiving Casual Speech: A Gating Experiment with French Listeners of L2 English
Authors: Naouel Zoghlami
Abstract:
Spoken-word recognition involves the simultaneous activation of potential word candidates which compete with each other for final correct recognition. In continuous speech, the activation-competition process becomes more complicated because of speech reductions at word boundaries. Lexical processing is more difficult in L2 than in L1 because L2 listeners often lack phonetic, lexico-semantic, syntactic, and prosodic knowledge in the target language. In this study, we investigate the on-line lexical segmentation hypotheses that French listeners of L2 English form and then revise as subsequent perceptual evidence is revealed. Our purpose is to shed further light on the processes of L2 spoken-word recognition in context and to better understand L2 listening difficulties by comparing the reactions of skilled and unskilled listeners at the point where their working hypothesis is rejected. We use a variant of the gating experiment in which subjects transcribe an English sentence presented in increments of progressively greater duration. The spoken sentence was “And this amazing athlete has just broken another world record”, chosen mainly because it includes common reductions and phonetic features of English, such as elision and assimilation. Our preliminary results show that there is an important difference in the manner in which proficient and less-proficient L2 listeners handle connected speech. Less-proficient listeners delay recognition of words as they wait for lexical and syntactic evidence to appear in the gates. Further statistical analyses are currently being undertaken.
Keywords: gating paradigm, spoken word recognition, online lexical segmentation, L2 listening
Procedia PDF Downloads 464
950 [Keynote Talk]: Computer-Assisted Language Learning (CALL) for Teaching English to Speakers of Other Languages (TESOL/ESOL) as a Foreign Language (TEFL/EFL), Second Language (TESL/ESL), or Additional Language (TEAL/EAL)
Authors: Andrew Laghos
Abstract:
Computer-assisted language learning (CALL) is defined as the use of computers to help learn languages. In this study, we look at several different types of CALL tools and applications and how they can assist adults and young learners in learning the English language as a foreign, second, or additional language. It is important to identify the roles of the teacher and the learners, and what the learners’ motivations are for learning the language. Audio, video, interactive multimedia games, online translation services, conferencing, chat rooms, discussion forums, social networks, social media, email communication, songs, and music video clips are just some of the many ways computers are currently being used to enhance language learning. CALL may be used for classroom teaching as well as for online and mobile learning. Advantages and disadvantages of CALL are discussed, and the study ends with predictions for the future of CALL.
Keywords: computer-assisted language learning (CALL), teaching English as a foreign language (TEFL/EFL), adult learners, young learners
Procedia PDF Downloads 434
949 DenseNet and Autoencoder Architecture for COVID-19 Chest X-Ray Image Classification and Improved U-Net Lung X-Ray Segmentation
Authors: Jonathan Gong
Abstract:
Purpose: AI-driven solutions are at the forefront of many pathology and medical imaging methods. Using algorithms designed to better the experience of medical professionals within their respective fields, the efficiency and accuracy of diagnosis can improve. In particular, X-rays are a fast and relatively inexpensive test that can diagnose diseases. In recent years, X-rays have not been widely used to detect and diagnose COVID-19. The underuse of X-rays is mainly due to low diagnostic accuracy and confounding with pneumonia, another respiratory disease. However, research in this field has indicated that artificial neural networks may be able to diagnose COVID-19 with high accuracy. Models and Data: The dataset used is the COVID-19 Radiography Database. This dataset includes images and masks of chest X-rays under the labels of COVID-19, normal, and pneumonia. The classification model developed uses an autoencoder and a pre-trained convolutional neural network (DenseNet201) to provide transfer learning to the model. The model then uses a deep neural network to finalize the feature extraction and predict the diagnosis for the input image. This model was trained on 4035 images and validated on 807 images separate from the ones used for training. The images used to train the classification model include an important feature: the pictures are cropped beforehand to eliminate distractions when training the model. The image segmentation model uses an improved U-Net architecture. This model is used to extract the lung mask from the chest X-ray image. The model is trained on 8577 images and validated on a validation split of 20%. These models are evaluated using the external dataset for validation. The models’ accuracy, precision, recall, F1-score, IOU, and loss are calculated. Results: The classification model achieved an accuracy of 97.65% and a loss of 0.1234 when differentiating COVID-19-infected, pneumonia-infected, and normal lung X-rays. 
The segmentation model achieved an accuracy of 97.31% and an IOU of 0.928. Conclusion: The models proposed can detect COVID-19, pneumonia, and normal lungs with high accuracy and derive the lung mask from a chest X-ray with similarly high accuracy. The hope is for these models to elevate the experience of medical professionals and provide insight into the future of the methods used.
Keywords: artificial intelligence, convolutional neural networks, deep learning, image processing, machine learning
Procedia PDF Downloads 130
948 Cross-Tier Collaboration between Preservice and Inservice Language Teachers in Designing Online Video-Based Pragmatic Assessment
Authors: Mei-Hui Liu
Abstract:
This paper reports the progression of language teachers’ learning to assess students’ speech act performance via online videos in a cross-tier professional growth community. This yearlong research project collected multiple data sources from several stakeholders, including 12 preservice and 4 inservice English as a foreign language (EFL) teachers, 4 English professionals, and 82 high school students. Data sources included surveys, (focus group) interviews, online reflection journals, online video-based assessment items/scores, and artifacts related to teacher professional learning. The major findings showed the effectiveness of the proposed learning module for language teacher development in pragmatic assessment as well as its impact on the student learning experience. All the teachers appreciated this professional learning experience, which enhanced their knowledge of assessing students’ pragmalinguistic and sociopragmatic performance in an English speech act (i.e., making refusals). They learned how to design online video-based assessment items by attending to specific linguistic structures, semantic formulae, and sociocultural issues. They further became aware of how to sharpen their pragmatic instructional skills in the near future after putting theories into online assessment and related classroom practices. Additionally, data analysis revealed students’ achievement in and satisfaction with the designed online assessment. Yet, during the professional learning process, most participating teachers encountered challenges in reaching a consensus on selecting appropriate video clips from available sources to present the sociocultural values in English-speaking refusal contexts. Another challenge was constructing test items that could reveal the influence of interlanguage transfer on students’ pragmatic performance in various conversational scenarios. 
With pedagogical implications and research suggestions, this study adds to the increasing amount of research into integrating preservice and inservice EFL teacher education in pragmatic assessment and relevant instruction. Acknowledgment: This research project is sponsored by the Ministry of Science and Technology in the Republic of China under grant number MOST 106-2410-H-029-038.
Keywords: cross-tier professional development, inservice EFL teachers, pragmatic assessment, preservice EFL teachers, student learning experience
Procedia PDF Downloads 259
947 A Topological Approach for Motion Track Discrimination
Authors: Tegan H. Emerson, Colin C. Olson, George Stantchev, Jason A. Edelberg, Michael Wilson
Abstract:
Detecting small targets at range is difficult because there is not enough spatial information present in an image sub-region containing the target to use correlation-based methods to differentiate it from dynamic confusers present in the scene. Moreover, this lack of spatial information also disqualifies the use of most state-of-the-art deep learning image-based classifiers. Here, we use characteristics of target tracks extracted from video sequences as data from which to derive distinguishing topological features that help robustly differentiate targets of interest from confusers. In particular, we calculate persistent homology from time-delayed embeddings of dynamic statistics calculated from motion tracks extracted from a wide field-of-view video stream. In short, we use topological methods to extract features related to target motion dynamics that are useful for classification and disambiguation and show that small targets can be detected at range with high probability.
Keywords: motion tracks, persistence images, time-delay embedding, topological data analysis
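The time-delay embedding step mentioned in this abstract can be shown with a short Python sketch. It covers only the embedding, not the persistent homology computation, and it is not the authors' code: the function name `delay_embed`, the choice of embedding dimension `dim` and delay `tau`, and the example series are assumptions for illustration.

```python
import numpy as np


def delay_embed(series, dim, tau):
    """Map a 1-D series x to points (x[t], x[t + tau], ..., x[t + (dim-1)*tau]).

    Returns an array of shape (n_points, dim); each row is one point of the
    embedded point cloud on which topological features could be computed.
    """
    x = np.asarray(series, dtype=float)
    n_points = x.size - (dim - 1) * tau
    if n_points <= 0:
        raise ValueError("series is too short for this dim and tau")
    columns = [x[i * tau : i * tau + n_points] for i in range(dim)]
    return np.stack(columns, axis=1)


# Hypothetical track statistic (for example, speed per frame) that oscillates.
frames = np.arange(200)
speed = np.sin(2.0 * np.pi * frames / 40.0)
cloud = delay_embed(speed, dim=2, tau=10)
```

For a periodic statistic like the one above, the embedded points trace out a closed loop, which is the kind of structure that persistent homology can summarize as a long-lived one-dimensional feature.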
Procedia PDF Downloads 114
946 An Investigation into Computer Vision Methods to Identify Material Other Than Grapes in Harvested Wine Grape Loads
Authors: Riaan Kleyn
Abstract:
Mass wine production companies across the globe are supplied with grapes by winegrowers who predominantly use mechanical harvesting machines to harvest wine grapes. Mechanical harvesting accelerates the rate at which grapes are harvested, allowing grapes to be delivered faster to meet the demands of wine cellars. The disadvantage of mechanical harvesting is the inclusion of material-other-than-grapes (MOG) in the harvested wine grape loads arriving at the cellar, which degrades the quality of the wine that can be produced. Currently, wine cellars do not have a method to determine the amount of MOG present within wine grape loads. This paper seeks to find an optimal computer vision method capable of detecting the amount of MOG within a wine grape load. A MOG detection method will encourage winegrowers to deliver MOG-free wine grape loads to avoid penalties, which will indirectly enhance the quality of the wine to be produced. Traditional image segmentation methods were compared to deep learning segmentation methods based on images of wine grape loads that were captured at a wine cellar. The Mask R-CNN model with a ResNet-50 convolutional neural network backbone emerged as the optimal method for this study to determine the amount of MOG in an image of a wine grape load. Furthermore, a statistical analysis was conducted to determine how the MOG on the surface of a grape load relates to the mass of MOG within the corresponding grape load.
Keywords: computer vision, wine grapes, machine learning, machine harvested grapes
Procedia PDF Downloads 94
945 Web Page Design Optimisation Based on Segment Analytics
Authors: Varsha V. Rohini, P. R. Shreya, B. Renukadevi
Abstract:
Web analytics optimizes information delivery and web usage through the analysis of data; it is the measurement, collection, and analysis of webpage data. Page statistics and user metrics are the important factors in most web analytics tools. The limitation of the existing tools is that they do not provide design inputs for the optimization of information. This paper aims to extend the scope of web analytics to provide analysis and statistics for each segment of a webpage. The click count is calculated, and the concentration of links in a web page is obtained. These user metrics are used to guide the design of the content displayed in a webpage, with segments obtained by the Vision Based Page Segmentation (VIPS) algorithm. When the algorithm is applied to a web page, it divides the entire page into a visual block tree. The visual block tree further divides the web page into visual blocks or segments, which help us understand the usage of each segment in a page and its content. Dynamic web pages and deep web pages are used to extend the scope of web page segment analytics. The space optimization concept is applied using the output of the VIPS algorithm. This technique gives us visibility into user interaction with web pages and helps us place the important links in the appropriate segments of the webpage and effectively manage the space in a page and the concentration of links.
Keywords: analytics, design optimization, visual block trees, vision based technology
Procedia PDF Downloads 266