Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 39

Search results for: dictionary

9 Providing Medical Information in Braille: Research and Development of Automatic Braille Translation Program for Japanese “eBraille“

Authors: Aki Sugano, Mika Ohta, Mineko Ikegami, Kenji Miura, Sayo Tsukamoto, Akihiro Ichinose, Toshiko Ohshima, Eiichi Maeda, Masako Matsuura, Yutaka Takao

Abstract:

Along with the advances in medicine, providing medical information to individual patient is becoming more important. In Japan such information via Braille is hardly provided to blind and partially sighted people. Thus we are researching and developing a Web-based automatic translation program “eBraille" to translate Japanese text into Japanese Braille. First we analyzed the Japanese transcription rules to implement them on our program. We then added medical words to the dictionary of the program to improve its translation accuracy for medical text. Finally we examined the efficacy of statistical learning models (SLMs) for further increase of word segmentation accuracy in braille translation. As a result, eBraille had the highest translation accuracy in the comparison with other translation programs, improved the accuracy for medical text and is utilized to make hospital brochures in braille for outpatients and inpatients.

Keywords: Automatic Braille translation, Medical text, Partially sighted people.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1583

8 Identification of Non-Lexicon Non-Slang Unigrams in Body-enhancement Medicinal UBE

Authors: Jatinderkumar R. Saini, Apurva A. Desai

Abstract:

Email has become a fast and cheap means of online communication. The main threat to email is Unsolicited Bulk Email (UBE), commonly called spam email. The current work aims at identification of unigrams in more than 2700 UBE that advertise body-enhancement drugs. The identification is based on the requirement that the unigram is neither present in dictionary, nor is a slang term. The motives of the paper are many fold. This is an attempt to analyze spamming behaviour and employment of wordmutation technique. On the side-lines of the paper, we have attempted to better understand the spam, the slang and their interplay. The problem has been addressed by employing Tokenization technique and Unigram BOW model. We found that the non-lexicon words constitute nearly 66% of total number of lexis of corpus whereas non-slang words constitute nearly 2.4% of non-lexicon words. Further, non-lexicon non-slang unigrams composed of 2 lexicon words, form more than 71% of the total number of such unigrams. To the best of our knowledge, this is the first attempt to analyze usage of non-lexicon non-slang unigrams in any kind of UBE.

Keywords: Body Enhancement, Lexicon, Medicinal, Slang, Unigram, Unsolicited Bulk e-mail (UBE)

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1793

7 A Sparse Representation Speech Denoising Method Based on Adapted Stopping Residue Error

Authors: Qianhua He, Weili Zhou, Aiwu Chen

Abstract:

A sparse representation speech denoising method based on adapted stopping residue error was presented in this paper. Firstly, the cross-correlation between the clean speech spectrum and the noise spectrum was analyzed, and an estimation method was proposed. In the denoising method, an over-complete dictionary of the clean speech power spectrum was learned with the K-singular value decomposition (K-SVD) algorithm. In the sparse representation stage, the stopping residue error was adaptively achieved according to the estimated cross-correlation and the adjusted noise spectrum, and the orthogonal matching pursuit (OMP) approach was applied to reconstruct the clean speech spectrum from the noisy speech. Finally, the clean speech was re-synthesised via the inverse Fourier transform with the reconstructed speech spectrum and the noisy speech phase. The experiment results show that the proposed method outperforms the conventional methods in terms of subjective and objective measure.

Keywords: Speech denoising, sparse representation, K-singular value decomposition, orthogonal matching pursuit.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 993

6 Particular Features of the First Romanian Multilingual Dictionaries

Authors: Mihaela Mocanu

Abstract:

The Romanian multilingual dictionaries – also named polyglot, plurilingual or polylingual dictionaries, have known a slow yet constant development starting with the end of the 17th century, when the first such work is attested, to the present time, when we witness a considerable increase of the number of polyglot dictionaries, especially the terminological ones. This paper aims at analyzing the context in which the first Romanian multilingual dictionaries were issued, as well as and the organization and structure particularities of the first lexicographic works of this type. The irretrievable loss of some of these works as well as the partial conservation of others renders the attempt to retrace the beginnings of Romanian lexicography extremely difficult. The research methodology is part of a descriptive and analytical approach based on two types of sources, subject to contrastive analysis: the notes made by the initiators of lexicographic projects and the testimonies of their contemporaries, respectively, along with the specialized studies regarding the history of the old Romanian lexicography. The analysis of the contents has indicated that these dictionaries lacked a scientific apparatus in the true sense of the phrase, failed to obey unitary organizational criteria, being limited, most of the times, to mere inventories of words, where the Romanian term was assigned its correspondent in other languages. Motivated by practical reasons, the first multilingual dictionaries were aimed at the clerics their purpose being to ensure the translators’ fidelity towards the original religious texts, regarded as sacred.

Keywords: Language, multilingual dictionary, Romanian lexicography, terminology.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1385

5 A Simple Adaptive Atomic Decomposition Voice Activity Detector Implemented by Matching Pursuit

Authors: Thomas Bryan, Veton Kepuska, Ivica Kostanic

Abstract:

A simple adaptive voice activity detector (VAD) is implemented using Gabor and gammatone atomic decomposition of speech for high Gaussian noise environments. Matching pursuit is used for atomic decomposition, and is shown to achieve optimal speech detection capability at high data compression rates for low signal to noise ratios. The most active dictionary elements found by matching pursuit are used for the signal reconstruction so that the algorithm adapts to the individual speakers dominant time-frequency characteristics. Speech has a high peak to average ratio enabling matching pursuit greedy heuristic of highest inner products to isolate high energy speech components in high noise environments. Gabor and gammatone atoms are both investigated with identical logarithmically spaced center frequencies, and similar bandwidths. The algorithm performs equally well for both Gabor and gammatone atoms with no significant statistical differences. The algorithm achieves 70% accuracy at a 0 dB SNR, 90% accuracy at a 5 dB SNR and 98% accuracy at a 20dB SNR using 30d B SNR as a reference for voice activity.

Keywords: Atomic Decomposition, Gabor, Gammatone, Matching Pursuit, Voice Activity Detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1773

4 A Computer Aided Detection (CAD) System for Microcalcifications in Mammograms - MammoScan mCaD

Authors: Kjersti Engan, Thor Ole Gulsrud, Karl Fredrik Fretheim, Barbro Furebotten Iversen, Liv Eriksen

Abstract:

Clusters of microcalcifications in mammograms are an important sign of breast cancer. This paper presents a complete Computer Aided Detection (CAD) scheme for automatic detection of clustered microcalcifications in digital mammograms. The proposed system, MammoScan μCaD, consists of three main steps. Firstly all potential microcalcifications are detected using a a method for feature extraction, VarMet, and adaptive thresholding. This will also give a number of false detections. The goal of the second step, Classifier level 1, is to remove everything but microcalcifications. The last step, Classifier level 2, uses learned dictionaries and sparse representations as a texture classification technique to distinguish single, benign microcalcifications from clustered microcalcifications, in addition to remove some remaining false detections. The system is trained and tested on true digital data from Stavanger University Hospital, and the results are evaluated by radiologists. The overall results are promising, with a sensitivity > 90 % and a low false detection rate (approx 1 unwanted pr. image, or 0.3 false pr. image).

Keywords: mammogram, microcalcifications, detection, CAD, MammoScan μCaD, VarMet, dictionary learning, texture, FTCM, classification, adaptive thresholding

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1787

3 Weighted-Distance Sliding Windows and Cooccurrence Graphs for Supporting Entity-Relationship Discovery in Unstructured Text

Authors: Paolo Fantozzi, Luigi Laura, Umberto Nanni

Abstract:

The problem of Entity relation discovery in structured data, a well covered topic in literature, consists in searching within unstructured sources (typically, text) in order to find connections among entities. These can be a whole dictionary, or a specific collection of named items. In many cases machine learning and/or text mining techniques are used for this goal. These approaches might be unfeasible in computationally challenging problems, such as processing massive data streams. A faster approach consists in collecting the cooccurrences of any two words (entities) in order to create a graph of relations - a cooccurrence graph. Indeed each cooccurrence highlights some grade of semantic correlation between the words because it is more common to have related words close each other than having them in the opposite sides of the text. Some authors have used sliding windows for such problem: they count all the occurrences within a sliding windows running over the whole text. In this paper we generalise such technique, coming up to a Weighted-Distance Sliding Window, where each occurrence of two named items within the window is accounted with a weight depending on the distance between items: a closer distance implies a stronger evidence of a relationship. We develop an experiment in order to support this intuition, by applying this technique to a data set consisting in the text of the Bible, split into verses.

Keywords: Cooccurrence graph, entity relation graph, unstructured text, weighted distance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 662

2 A Cost Effective Approach to Develop Mid-size Enterprise Software Adopted the Waterfall Model

Authors: M. N. Hasnine, M. K. H. Chayon, M. M. Rahman

Abstract:

Organizational tendencies towards computer-based information processing have been observed noticeably in the third-world countries. Many enterprises are taking major initiatives towards computerized working environment because of massive benefits of computer-based information processing. However, designing and developing information resource management software for small and mid-size enterprises under budget costs and strict deadline is always challenging for software engineers. Therefore, we introduced an approach to design mid-size enterprise software by using the Waterfall model, which is one of the SDLC (Software Development Life Cycles), in a cost effective way. To fulfill research objectives, in this study, we developed mid-sized enterprise software named “BSK Management System” that assists enterprise software clients with information resource management and perform complex organizational tasks. Waterfall model phases have been applied to ensure that all functions, user requirements, strategic goals, and objectives are met. In addition, Rich Picture, Structured English, and Data Dictionary have been implemented and investigated properly in engineering manner. Furthermore, an assessment survey with 20 participants has been conducted to investigate the usability and performance of the proposed software. The survey results indicated that our system featured simple interfaces, easy operation and maintenance, quick processing, and reliable and accurate transactions.

Keywords: End-user Application Development, Enterprise Software Design, Information Resource Management, Usability.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1931

1 Inner and Outer School Contextual Factors Associated with Poor Performance of Grade 12 Students: A Case Study of an Underperforming High School in Mpumalanga, South Africa

Authors: Victoria L. Nkosi, Parvaneh Farhangpour

Abstract:

Often a Grade 12 certificate is perceived as a passport to tertiary education and the minimum requirement to enter the world of work. In spite of its importance, many students do not make this milestone in South Africa. It is important to find out why so many students still fail in spite of transformation in the education system in the post-apartheid era. Given the complexity of education and its context, this study adopted a case study design to examine one historically underperforming high school in Bushbuckridge, Mpumalanga Province, South Africa in 2013. The aim was to gain a understanding of the inner and outer school contextual factors associated with the high failure rate among Grade 12 students. Government documents and reports were consulted to identify factors in the district and the village surrounding the school and a student survey was conducted to identify school, home and student factors. The randomly-sampled half of the population of Grade 12 students (53) participated in the survey and quantitative data are analyzed using descriptive statistical methods. The findings showed that a host of factors is at play. The school is located in a village within a municipality which has been one of the poorest three municipalities in South Africa and the lowest Grade 12 pass rate in the Mpumalanga province. Moreover, over half of the families of the students are single parents, 43% are unemployed and the majority has a low level of education. In addition, most families (83%) do not have basic study materials such as a dictionary, books, tables, and chairs. A significant number of students (70%) are over-aged (+19 years old); close to half of them (49%) are grade repeaters. The school itself lacks essential resources, namely computers, science laboratories, library, and enough furniture and textbooks. Moreover, teaching and learning are negatively affected by the teachers’ occasional absenteeism, inadequate lesson preparation, and poor communication skills. Overall, the continuous low performance of students in this school mirrors the vicious circle of multiple negative conditions present within and outside of the school. The complexity of factors associated with the underperformance of Grade 12 students in this school calls for a multi-dimensional intervention from government and stakeholders. One important intervention should be the placement of over-aged students and grade-repeaters in suitable educational institutions for the benefit of other students.

Keywords: Inner context, outer context, over-aged students, vicious circle.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1225