Data
- Total of ~138K bars in four-four time
- Scale degrees (considered relative to the C major scale)
- Total of ~160K bars in four-four time
- Transpose all songs to the C major / A minor scale
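Transposition to C major / A minor can be sketched as a shift along the pitch axis of a piano-roll. This is only a minimal illustration; the key of each song (`key_root` below) is assumed to be known already, and the function name is hypothetical:

```python
import numpy as np

def transpose_to_c(pianoroll, key_root):
    """Shift a (time, 128) binary piano-roll so that `key_root`
    (0 = C, 1 = C#, ..., 11 = B) maps to C, moving by at most a
    tritone in either direction. Note: np.roll wraps around at the
    pitch boundaries, which is acceptable for this sketch."""
    shift = -key_root if key_root <= 6 else 12 - key_root
    return np.roll(pianoroll, shift, axis=1)

roll = np.zeros((96, 128), dtype=bool)
roll[0, 62] = True             # a D4 note in a D-major song
out = transpose_to_c(roll, 2)  # D major: shift down 2 semitones to C
```

The same shift is applied to every track of a song so that relative harmony is preserved.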
Training Data
- Use symbolic timing, which discards tempo information (see here for more details)
- Discard velocity information (using binary-valued piano-rolls)
- 84 possibilities for note pitch (from C1 to B7)
- Merge tracks into 5 categories: Bass, Drums, Guitar, Piano and
Strings
- Consider only songs with a rock tag
- Collect musically meaningful 4-bar phrases for the temporal model by
segmenting the piano-rolls with the structure features proposed in [1]
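The first three bullet points above can be sketched as a small preprocessing step: binarize the velocities and keep only the 84 pitches from C1 to B7. This assumes the MIDI convention in which C4 = 60, so C1 = 24 and B7 = 107; the function name is illustrative:

```python
import numpy as np

def preprocess(pianoroll):
    """Binarize a (time, 128) velocity piano-roll and crop it to the
    84 pitches from C1 (MIDI 24) to B7 (MIDI 107)."""
    binary = pianoroll > 0      # discard velocity information
    return binary[:, 24:108]    # 108 - 24 = 84 pitch slots remain

raw = np.random.randint(0, 128, size=(96, 128))  # one bar, 96 time steps
out = preprocess(raw)
# out.shape is (96, 84), dtype bool
```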
Hence, the size of the target output tensor is 4 (bar) × 96 (time step)
× 84 (pitch) × 5 (track).
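As a concrete illustration of the 4 × 96 × 84 × 5 layout, one phrase can be held in a boolean NumPy array. The track ordering used for the last axis here is an assumption (the document lists the five categories but does not fix their index order):

```python
import numpy as np

N_BAR, N_TIME, N_PITCH, N_TRACK = 4, 96, 84, 5
phrase = np.zeros((N_BAR, N_TIME, N_PITCH, N_TRACK), dtype=bool)

# Set a single note: bar 0, time step 0, pitch C4, on track 3.
# C4 (MIDI 60) sits at index 60 - 24 = 36 within the C1..B7 range.
# The track index -> instrument mapping is assumed, not specified here.
phrase[0, 0, 36, 3] = True
```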
- tra_phr.npy (7.54 GB) contains 50,266 four-bar phrases. The shape is (50266, 384, 84, 5).
- tra_bar.npy (4.79 GB) contains 127,734 bars. The shape is (127734, 96, 84, 5).
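Since these files are several gigabytes, one way to work with them is to memory-map them with `np.load(..., mmap_mode="r")` so only the slices you touch are read from disk. A tiny stand-in file with the same layout is created below for illustration (the real `tra_phr.npy` has shape (50266, 384, 84, 5)):

```python
import os
import tempfile

import numpy as np

# Create a small stand-in with the same layout as tra_phr.npy
# (2 phrases instead of 50,266).
path = os.path.join(tempfile.mkdtemp(), "tra_phr_small.npy")
np.save(path, np.zeros((2, 384, 84, 5), dtype=bool))

# Memory-map so a multi-GB file need not fit in RAM.
phrases = np.load(path, mmap_mode="r")

# 384 time steps = 4 bars x 96 steps; split one phrase into bars.
bars = np.asarray(phrases[0]).reshape(4, 96, 84, 5)
```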
Here are two examples of four-bar, five-track piano-rolls seen in our
training data. The tracks are (from top to bottom): Bass, Drums,
Guitar, Strings, Piano.
Reference
- Joan Serrà, Meinard Müller, Peter Grosche and Josep Ll. Arcos,
“Unsupervised Detection of Music Boundaries by Time Series Structure
Features,”
in AAAI Conference on Artificial Intelligence (AAAI), 2012.