Arabic Speaker Recognition: Babylon Levantine Subset Case Study
Abstract
Problem statement: Research on Arabic speaker recognition has relied on local databases that are unavailable to the public. In this study we investigate Arabic speaker recognition using a publicly available database, namely Babylon Levantine, available from the Linguistic Data Consortium (LDC). Approach: Among the different methods for speaker recognition, we focus on Hidden Markov Models (HMMs). We studied the effect of both the HMM parameters and the size of the speech feature vector on the recognition rate. Results: To accomplish this study, we divided the database into small and medium-sized datasets. For each subset, we determined the effect of the system parameters on the recognition rate. The parameters we varied were the number of HMM states, the number of Gaussian mixtures per state, and the number of speech feature coefficients. We found that, in general, the recognition rate increases with the number of mixtures until it reaches a saturation level that depends on the data size and the number of HMM states. Conclusion/Recommendations: The effect of the number of states depends on the data size. For small datasets, a low number of states gives a higher recognition rate. For larger datasets, the number of states has a very small effect at a low number of mixtures and a negligible effect at a high number of mixtures.
DOI: https://doi.org/10.3844/jcssp.2010.381.385
Copyright: © 2010 Mansour Alsulaiman, Youssef Alotaibi, Muhammad Ghulam, Mohamed A. Bencherif and Awais Mahmoud. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Keywords
- HMM
- GMM
- MFCC
- Arabic speaker
- Babylon
- Levantine