The xcorrSound tool compares sound waves using cross correlation and finds the best overlap. The package contains several tools:
- overlap-analysis (formerly xcorrSound) is a tool to find the overlap between two audio files. It outputs the sample number of the best match, the corresponding time in seconds, and the value of the match.
- sound-match is a tool to find all occurrences of a shorter WAV file within a larger WAV file.
- waveform-compare (formerly migration-qa) is a tool that splits two audio files into equal-sized blocks (default 5 seconds) and outputs the correlation for each pair of blocks (a_i, b_i), where a and b are the input files.
- sound_index is a tool that builds an index in which sound-match can find all occurrences of a shorter WAV file (still in development).
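
As a rough illustration of the block-wise comparison described for waveform-compare, the sketch below splits two signals into equal-sized blocks and reports one correlation value per block pair (a_i, b_i). It is not the tool's actual implementation: the function name `block_correlations`, its parameters, and the choice of a zero-lag normalized (Pearson) correlation are assumptions made for this example, and the signals are assumed to be already decoded into NumPy arrays.

```python
import numpy as np

def block_correlations(a, b, sample_rate, block_seconds=5.0):
    """Hypothetical helper: correlation of each block pair (a_i, b_i).

    Assumption: a zero-lag normalized correlation per block. The real
    waveform-compare tool may compute its per-block value differently.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    block = int(sample_rate * block_seconds)   # samples per block
    n = min(len(a), len(b))                    # compare only the common length
    results = []
    for start in range(0, n - block + 1, block):
        x = a[start:start + block]
        y = b[start:start + block]
        x = x - x.mean()                       # remove the DC offset
        y = y - y.mean()
        denom = np.sqrt(np.dot(x, x) * np.dot(y, y))
        # A constant (e.g. silent) block has zero variance; report 0.0 for it.
        results.append(float(np.dot(x, y) / denom) if denom > 0 else 0.0)
    return results
```

With this sketch, two identical signals give values close to 1.0 for every block, while a block that differs between the two inputs gives a clearly lower value.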
The tools all make use of cross correlation, which can be computed through the Fourier transform.
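
The following minimal sketch shows how a cross correlation can be computed through the Fourier transform and used to find the best offset between two signals. It relies on NumPy's FFT routines. The function name `best_offset` and the unnormalized match value are assumptions chosen for illustration and do not describe the internals of the xcorrSound tools.

```python
import numpy as np

def best_offset(a, b):
    """Hypothetical helper: return (lag, value) of the best match.

    lag is the sample position in `a` at which `b` lines up best;
    value is the raw (unnormalized) cross-correlation at that lag.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n = len(a) + len(b) - 1                 # number of distinct lags
    size = 1 << (n - 1).bit_length()        # zero-pad to a power of two >= n
    fa = np.fft.rfft(a, size)
    fb = np.fft.rfft(b, size)
    # Correlation theorem: multiplying by the complex conjugate in the
    # frequency domain corresponds to cross correlation in the time domain.
    cc = np.fft.irfft(fa * np.conj(fb), size)
    # Negative lags wrap around to the end of the array; rotate them to the
    # front so that index 0 corresponds to lag -(len(b) - 1).
    cc = np.roll(cc, len(b) - 1)[:n]
    best = int(np.argmax(cc))
    return best - (len(b) - 1), float(cc[best])
```

Dividing the returned lag by the sample rate gives the offset in seconds. The FFT-based computation takes O(n log n) time, compared with O(n^2) for evaluating the correlation sum directly at every lag.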
This tool was developed and used at The Danish State and University Library in the following scenarios.
The State and University Library in Denmark possesses a large number of digital audio files containing radio broadcasts from DR (Danmarks Radio, the Danish national broadcaster). All radio broadcasts from DR in the period 1989 to 2005 were recorded on 2-hour tapes. In order not to lose any data, one tape was put in one recorder and, a few minutes before it reached the end, another recorder was started with another tape. This yields consecutive tapes with some overlap. All these tapes have now been digitized, and the resulting dataset is 20 TB of audio files in 2-hour chunks with overlaps of unknown length. The library wishes to remove this overlap, and the xcorrSound tool finds it.
Migration and QA Scenario
The waveform-compare tool is used in SCAPE Solution SO4 Audio mp3 to wav Migration and QA Workflow.
Sound Match and Sound Index Scenarios