LibriSpeech

About

The LibriSpeech dev-clean dataset offers about 5 hours of high-quality read English audiobook speech for training and testing Automatic Speech Recognition (ASR) systems. Its diverse speakers, clean audio, and manageable size make it well suited to initial experimentation and model development. It is only one subset of the broader LibriSpeech corpus, so for real-world scenarios it is worth also working with audio of varying quality.

Dataset Size: 322.29 MB