XcorrSound

From COPTR
Revision as of 15:07, 21 April 2021 by Prwheatley (talk | contribs)




The xcorrSound package compares sound waves using cross correlation.
Homepage: https://github.com/openplanets/scape-xcorrsound
License: GPL-2.0
Function: De-Duplication, Quality Assurance
Content type: Audio




Description

The xcorrSound tool compares sound waves using cross correlation and finds the best overlap. The package contains several tools.

  • overlap-analysis (earlier xcorrSound) is a tool to find the overlap between two audio files. It outputs the sample number of the best match, the corresponding time in seconds, and the match value.
  • sound-match is a tool to find all occurrences of a shorter WAV file within a longer WAV file.
  • waveform-compare (earlier migration-qa) is a tool that splits two audio files into equal-sized blocks (default 5 seconds) and outputs the correlation for each pair of blocks (a_i, b_i), where a and b are the input files.
  • sound_index is a tool to build an index in which sound-match can find all occurrences of a shorter WAV file (still in development).

The tools all make use of cross correlation, which can be computed through the Fourier transform.
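
As an illustration of how cross correlation can be computed through the Fourier transform, the following is a minimal sketch in plain Python. It is not the xcorrSound implementation: the function names (fft, cross_correlate) and the toy input are assumptions made for this example, and a production tool would use an optimised FFT library. The sketch zero-pads both signals, multiplies the transform of one by the complex conjugate of the transform of the other, and transforms back.

```python
import cmath

def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalised); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def cross_correlate(a, b):
    """Return {lag: sum_n a[n + lag] * b[n]} for every lag where a and b overlap."""
    # Zero-pad to a power of two that is large enough to avoid wrap-around.
    size = 1
    while size < len(a) + len(b) - 1:
        size *= 2
    fa = fft([complex(v) for v in a] + [0j] * (size - len(a)))
    fb = fft([complex(v) for v in b] + [0j] * (size - len(b)))
    # Correlation theorem: transform of the correlation is FA * conj(FB).
    product = [x * y.conjugate() for x, y in zip(fa, fb)]
    r = [v.real / size for v in fft(product, inverse=True)]
    # Negative lags are stored at the end of the circular result.
    return {lag: r[lag % size] for lag in range(-(len(b) - 1), len(a))}

# Toy example (hypothetical data): b appears inside a, starting at sample 2.
a = [0, 0, 1, 2, 3, 0, 0, 0]
b = [1, 2, 3]
corr = cross_correlate(a, b)
best_lag = max(corr, key=corr.get)
print("best lag:", best_lag)
```

In this toy input the correlation peaks at lag 2, the position where b lines up with its copy inside a. The FFT route costs O(N log N) instead of the O(N^2) of evaluating the sum directly for every lag, which matters for recordings that are hours long.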

User Experiences

This tool was developed and used at The Danish State and University Library in the following scenarios.

Overlap Scenario

The State and University Library in Denmark holds a large collection of digital audio files of radio broadcasts from DR (Danmarks (Danish) Radio). All radio broadcasts from DR in the period 1989 to 2005 have been digitised. The broadcasts were recorded on 2-hour tapes. In order not to lose any data, one tape was put in one recorder and, a few minutes before it reached the end, another recorder was started with another tape. This yields two tapes with some overlap. All these tapes have now been digitised, and the resulting dataset is 20 TB of audio files in 2-hour chunks with unknown overlaps; the library wishes to remove this overlap. The xcorrSound tool finds the overlap.
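
The scenario above can be sketched on synthetic data as follows. This is only an illustration under stated assumptions: the function names, the sample rate, the minimum overlap and the random "broadcast" are all hypothetical, and the search is a brute-force comparison chosen for brevity, whereas the tool itself relies on cross correlation computed through the Fourier transform.

```python
import math
import random

def normalized_correlation(x, y):
    """Zero-lag correlation of two equal-length sample lists, scaled to [-1, 1]."""
    dot = sum(p * q for p, q in zip(x, y))
    norm = math.sqrt(sum(p * p for p in x) * sum(q * q for q in y))
    return dot / norm if norm else 0.0

def find_overlap(first, second, min_overlap):
    """Return (overlap_length, score): how many samples at the end of `first`
    best match the start of `second`. Very short overlaps match by chance,
    so candidates shorter than min_overlap are skipped."""
    best = (0, -1.0)
    for m in range(min_overlap, min(len(first), len(second)) + 1):
        score = normalized_correlation(first[-m:], second[:m])
        if score > best[1]:
            best = (m, score)
    return best

# Synthetic "broadcast" of 200 samples, recorded on two overlapping "tapes".
rng = random.Random(0)
program = [rng.gauss(0.0, 1.0) for _ in range(200)]
first = program[:120]    # tape 1 ends at sample 120
second = program[80:]    # tape 2 started 40 samples before tape 1 ended

overlap, score = find_overlap(first, second, min_overlap=10)

SAMPLE_RATE = 8000       # hypothetical sample rate in Hz
overlap_seconds = overlap / SAMPLE_RATE

# Removing the overlap: keep tape 1 whole and drop the duplicated start of tape 2.
joined = first + second[overlap:]
print(overlap, overlap_seconds, joined == program)
```

Once the overlap length is known, removing it amounts to dropping that many samples from the start of the second recording before concatenating.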

Migration and QA Scenario

The waveform-compare tool is used in SCAPE Solution SO4 Audio mp3 to wav Migration and QA Workflow.
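
The block-wise comparison that waveform-compare is described as performing can be sketched as below. This is a simplified illustration, not the tool's code: it uses a zero-lag normalised correlation per block, and the function names, the block size in samples and the 0.98 threshold are assumptions made for the example. With real audio, a 5-second block would be 5 times the sample rate in samples.

```python
import math

def correlation(x, y):
    """Zero-lag normalised correlation of two equal-length blocks."""
    dot = sum(p * q for p, q in zip(x, y))
    norm = math.sqrt(sum(p * p for p in x) * sum(q * q for q in y))
    if norm == 0:
        # At least one block is silent: identical blocks count as a match.
        return 1.0 if x == y else 0.0
    return dot / norm

def block_correlations(a, b, block_size):
    """Split a and b into blocks of block_size samples and correlate block i
    of a with block i of b. Only the common length of a and b is compared."""
    n = min(len(a), len(b))
    scores = []
    for start in range(0, n, block_size):
        end = min(start + block_size, n)
        scores.append(correlation(a[start:end], b[start:end]))
    return scores

# Hypothetical data: a "migrated" copy whose third block has inverted samples.
original = [math.sin(0.3 * i) for i in range(40)]
migrated = list(original)
migrated[20:30] = [-v for v in migrated[20:30]]

scores = block_correlations(original, migrated, block_size=10)

THRESHOLD = 0.98  # hypothetical acceptance threshold
suspect = [i for i, s in enumerate(scores) if s < THRESHOLD]
print("blocks below threshold:", suspect)
```

Reporting a score per block rather than one score for the whole file makes it possible to locate where a migrated file differs from its source, not only to detect that it does.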

Sound Match and Sound Index Scenarios

  • TBD*

Development Activity