Welcome to Philip Perry's AWK Music Page


The approach being followed at present is to use music scores in MIDI file format and then convert these to a textual form. This produces a text file which is little different from an event list found in most MIDI sequencers. It is best to generate the original MIDI file from a score-writing application so that the rhythmic data is exact - a human performer is unlikely to play with the mechanical precision required.

My 'home' operating system, the delightful RISC OS, has a well-featured freeware MIDI analysing program - Midiphile. I have used this program to generate the base textual description of the music (MIDI file) upon which the following awk scripts operate. As an alternative, for users of other platforms, I have written the perl script mid2txt which performs the same function. So wherever Midiphile is mentioned below, read mid2txt as required.

AWK scripts are used in a multi-stage process on the text output of Midiphile. Stage one is to prepare the data. That is, convert it to a convenient form for either melodic or harmonic analysis. Stage two then examines the prepared data with further awk scripts to extract the particular information required.

File Names, Paths, Etc.

Scripts dealing with melodic data are named 'melxx_awk' (where xx is a number) and their output will appear in the Current Working Directory with the name 'melxx_txt'. A similar scheme is used for harmonic data - e.g. 'har01_awk'. Output to a 'bare' filename in the CWD (current working directory) has been adopted because at least one (Windows) port of gawk I use has difficulty with full path names. Also, my own 'home' OS uses the dot (period) as a directory separator and a forward slash as a stand-in for the dot filename-filetype separator, and it does not store the file type in the filename. The forms '_awk' and '_txt' therefore save the constant labour of switching back and forth between dot and underscore when moving between different OSs. If required, these features could be changed by a few global search and replace operations on scripts downloaded to your own system.


Stage One - Preparing the Data

For illustrative purposes the first Prelude and Fugue from the 'Well-tempered Clavier' by J.S. Bach are used in this guide.

From the above score two MIDI files are exported, being sure that all of the MIDI data for the Prelude are contained within one MIDI track (for harmonic analysis) and that the four voices of the Fugue each have their own MIDI tracks (for melodic analysis). These two MIDI files are next converted to a descriptive text format by the 'Midiphile' program, illustrated below.


Prelude No1 - Bars 1-4

Midiphile : 0.21 (13 Mar 2001)

MIDI file : ADFS::HardDisc4.$.Music.Sib7_Files.BachJS.P1_48/MID
Decoded   : Sat 28 Dec 2002. 11:16 PM
Tracks    : 0 and 1 (all data)

Events tagged :
 M : Meta event
 A : ANO (Ctrlr 123)
 P : Program Change
 N : Excess Note On
 F : Excess Note Off
 H : Hanging note

˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜

MThd (6 bytes)
 Format 1 (2 simultaneous tracks)
 Metrical time (TickDiv = 96 ppqn)

MTrk (18 bytes)        TRACK 0, Tempo track
   Bar:Bt:Tck  Event Type
 M 001:01:000  Meta  KeySig   C Maj
 M 001:01:000  Meta  TimeSig  4/4 (cc=24 bb=8)
 M 001:01:000  Meta  EOT
~> Found 3 Meta events

MTrk (4432 bytes)      TRACK 1
   Bar:Bt:Tck  Event Type
   001:01:000  Midi  Ctrllr    1   0 (Bank Select; MSB) 0
   001:01:000  Midi  Ctrllr    1   32 (Bank Select; LSB) 0
 P 001:01:000  Midi  ProgChg   1   0
   001:01:000  Midi  Ctrllr    1   10 (Pan; MSB) 64
   001:01:000  Midi  Ctrllr    1   91 (Effects 1) 61
   001:01:000  Midi  NoteOn    1   C3   v:78
   001:01:024  Midi  NoteOn    1   E3   v:78
   001:01:048  Midi  NoteOn    1   G3   v:76
   001:01:072  Midi  NoteOn    1   G3   v:0
   001:01:072  Midi  NoteOn    1   C4   v:90
   001:02:000  Midi  NoteOn    1   C4   v:0
   001:02:000  Midi  NoteOn    1   E4   v:95
   001:02:024  Midi  NoteOn    1   E4   v:0
   001:02:024  Midi  NoteOn    1   G3   v:73
   001:02:048  Midi  NoteOn    1   G3   v:0
   001:02:048  Midi  NoteOn    1   C4   v:80
   001:02:072  Midi  NoteOn    1   C4   v:0
   001:02:072  Midi  NoteOn    1   E4   v:97
   001:03:000  Midi  NoteOn    1   E4   v:0
   001:03:000  Midi  NoteOn    1   E3   v:0
   001:03:000  Midi  NoteOn    1   C3   v:0
   001:03:000  Midi  NoteOn    1   C3   v:71
   001:03:024  Midi  NoteOn    1   E3   v:78
   001:03:048  Midi  NoteOn    1   G3   v:76
   001:03:072  Midi  NoteOn    1   G3   v:0
   001:03:072  Midi  NoteOn    1   C4   v:85
   001:04:000  Midi  NoteOn    1   C4   v:0
   001:04:000  Midi  NoteOn    1   E4   v:105
   001:04:024  Midi  NoteOn    1   E4   v:0
   001:04:024  Midi  NoteOn    1   G3   v:75
   001:04:048  Midi  NoteOn    1   G3   v:0
   001:04:048  Midi  NoteOn    1   C4   v:83
   001:04:072  Midi  NoteOn    1   C4   v:0
   001:04:072  Midi  NoteOn    1   E4   v:97
   002:01:000  Midi  NoteOn    1   E4   v:0
   002:01:000  Midi  NoteOn    1   E3   v:0
   002:01:000  Midi  NoteOn    1   C3   v:0
   002:01:000  Midi  NoteOn    1   C3   v:80
   002:01:024  Midi  NoteOn    1   D3   v:74
   002:01:048  Midi  NoteOn    1   A3   v:78
   002:01:072  Midi  NoteOn    1   A3   v:0
   002:01:072  Midi  NoteOn    1   D4   v:90
   002:02:000  Midi  NoteOn    1   D4   v:0
   002:02:000  Midi  NoteOn    1   F4   v:99
   002:02:024  Midi  NoteOn    1   F4   v:0
   002:02:024  Midi  NoteOn    1   A3   v:73
   002:02:048  Midi  NoteOn    1   A3   v:0
   002:02:048  Midi  NoteOn    1   D4   v:83
   002:02:072  Midi  NoteOn    1   D4   v:0
   002:02:072  Midi  NoteOn    1   F4   v:92
   002:03:000  Midi  NoteOn    1   F4   v:0
   002:03:000  Midi  NoteOn    1   D3   v:0
   002:03:000  Midi  NoteOn    1   C3   v:0
   002:03:000  Midi  NoteOn    1   C3   v:78
   002:03:024  Midi  NoteOn    1   D3   v:78
   002:03:048  Midi  NoteOn    1   A3   v:74
   002:03:072  Midi  NoteOn    1   A3   v:0
   002:03:072  Midi  NoteOn    1   D4   v:90
   002:04:000  Midi  NoteOn    1   D4   v:0
   002:04:000  Midi  NoteOn    1   F4   v:93
   002:04:024  Midi  NoteOn    1   F4   v:0
   002:04:024  Midi  NoteOn    1   A3   v:78
   002:04:048  Midi  NoteOn    1   A3   v:0
   002:04:048  Midi  NoteOn    1   D4   v:80
   002:04:072  Midi  NoteOn    1   D4   v:0
   002:04:072  Midi  NoteOn    1   F4   v:95
   003:01:000  Midi  NoteOn    1   F4   v:0
   003:01:000  Midi  NoteOn    1   D3   v:0
   003:01:000  Midi  NoteOn    1   C3   v:0
   003:01:000  Midi  NoteOn    1   B2   v:79
   003:01:024  Midi  NoteOn    1   D3   v:80
   003:01:048  Midi  NoteOn    1   G3   v:78
   003:01:072  Midi  NoteOn    1   G3   v:0
   003:01:072  Midi  NoteOn    1   D4   v:93
   003:02:000  Midi  NoteOn    1   D4   v:0
   003:02:000  Midi  NoteOn    1   F4   v:101
   003:02:024  Midi  NoteOn    1   F4   v:0
   003:02:024  Midi  NoteOn    1   G3   v:72
   003:02:048  Midi  NoteOn    1   G3   v:0
   003:02:048  Midi  NoteOn    1   D4   v:88
   003:02:072  Midi  NoteOn    1   D4   v:0
   003:02:072  Midi  NoteOn    1   F4   v:92
   003:03:000  Midi  NoteOn    1   F4   v:0
   003:03:000  Midi  NoteOn    1   D3   v:0
   003:03:000  Midi  NoteOn    1   B2   v:0
   003:03:000  Midi  NoteOn    1   B2   v:74
   003:03:024  Midi  NoteOn    1   D3   v:76
   003:03:048  Midi  NoteOn    1   G3   v:74
   003:03:072  Midi  NoteOn    1   G3   v:0
   003:03:072  Midi  NoteOn    1   D4   v:84
   003:04:000  Midi  NoteOn    1   D4   v:0
   003:04:000  Midi  NoteOn    1   F4   v:101
   003:04:024  Midi  NoteOn    1   F4   v:0
   003:04:024  Midi  NoteOn    1   G3   v:77
   003:04:048  Midi  NoteOn    1   G3   v:0
   003:04:048  Midi  NoteOn    1   D4   v:86
   003:04:072  Midi  NoteOn    1   D4   v:0
   003:04:072  Midi  NoteOn    1   F4   v:88
   004:01:000  Midi  NoteOn    1   F4   v:0
   004:01:000  Midi  NoteOn    1   D3   v:0
   004:01:000  Midi  NoteOn    1   B2   v:0
   004:01:000  Midi  NoteOn    1   C3   v:83
   004:01:024  Midi  NoteOn    1   E3   v:80
   004:01:048  Midi  NoteOn    1   G3   v:80
   004:01:072  Midi  NoteOn    1   G3   v:0
   004:01:072  Midi  NoteOn    1   C4   v:85
   004:02:000  Midi  NoteOn    1   C4   v:0
   004:02:000  Midi  NoteOn    1   E4   v:99
   004:02:024  Midi  NoteOn    1   E4   v:0
   004:02:024  Midi  NoteOn    1   G3   v:78
   004:02:048  Midi  NoteOn    1   G3   v:0
   004:02:048  Midi  NoteOn    1   C4   v:80
   004:02:072  Midi  NoteOn    1   C4   v:0
   004:02:072  Midi  NoteOn    1   E4   v:97
   004:03:000  Midi  NoteOn    1   E4   v:0
   004:03:000  Midi  NoteOn    1   E3   v:0
   004:03:000  Midi  NoteOn    1   C3   v:0
   004:03:000  Midi  NoteOn    1   C3   v:85
   004:03:024  Midi  NoteOn    1   E3   v:78
   004:03:048  Midi  NoteOn    1   G3   v:76
   004:03:072  Midi  NoteOn    1   G3   v:0
   004:03:072  Midi  NoteOn    1   C4   v:87
   004:04:000  Midi  NoteOn    1   C4   v:0
   004:04:000  Midi  NoteOn    1   E4   v:101
   004:04:024  Midi  NoteOn    1   E4   v:0
   004:04:024  Midi  NoteOn    1   G3   v:78
   004:04:048  Midi  NoteOn    1   G3   v:0
   004:04:048  Midi  NoteOn    1   C4   v:83
   004:04:072  Midi  NoteOn    1   C4   v:0
   004:04:072  Midi  NoteOn    1   E4   v:97
   005:01:000  Midi  NoteOn    1   E4   v:0
   005:01:000  Midi  NoteOn    1   E3   v:0
   005:01:000  Midi  NoteOn    1   C3   v:0


Fugue, Upper Voice - Bars 1-6

[Without file header]
MTrk (2038 bytes)      TRACK 1
   Bar:Bt:Tck  Event Type
   001:01:000  Midi  Ctrllr    1   7 (Volume; MSB) 80
   001:01:000  Midi  Ctrllr    1   1 (Modulation; MSB) 8
   001:01:000  Midi  Ctrllr    1   91 (Effects 1) 48
   001:01:000  Midi  Ctrllr    1   93 (Effects 3) 0
   001:01:000  Midi  Ctrllr    1   11 (Expression; MSB) 64
   001:03:048  Midi  Ctrllr    1   0 (Bank Select; MSB) 0
   001:03:048  Midi  Ctrllr    1   32 (Bank Select; LSB) 0
 P 001:03:048  Midi  ProgChg   1   74
   001:03:048  Midi  Ctrllr    1   10 (Pan; MSB) 56
   001:03:048  Midi  Ctrllr    1   91 (Effects 1) 61
   002:03:048  Midi  NoteOn    1   G3   v:80
   002:04:000  Midi  NoteOn    1   G3   v:0
   002:04:000  Midi  NoteOn    1   A3   v:85
   002:04:048  Midi  NoteOn    1   A3   v:0
   002:04:048  Midi  NoteOn    1   B3   v:83
   003:01:000  Midi  NoteOn    1   B3   v:0
   003:01:000  Midi  NoteOn    1   C4   v:83
   003:01:072  Midi  NoteOn    1   C4   v:0
   003:01:072  Midi  NoteOn    1   D4   v:85
   003:01:084  Midi  NoteOn    1   D4   v:0
   003:01:084  Midi  NoteOn    1   C4   v:85
   003:02:000  Midi  NoteOn    1   C4   v:0
   003:02:000  Midi  NoteOn    1   B3   v:83
   003:02:048  Midi  NoteOn    1   B3   v:0
   003:02:048  Midi  NoteOn    1   E4   v:78
   003:03:000  Midi  NoteOn    1   E4   v:0
   003:03:000  Midi  NoteOn    1   A3   v:88
   003:03:048  Midi  NoteOn    1   A3   v:0
   003:03:048  Midi  NoteOn    1   D4   v:75
   003:04:024  Midi  NoteOn    1   D4   v:0
   003:04:024  Midi  NoteOn    1   E4   v:83
   003:04:048  Midi  NoteOn    1   E4   v:0
   003:04:048  Midi  NoteOn    1   D4   v:85
   003:04:072  Midi  NoteOn    1   D4   v:0
   003:04:072  Midi  NoteOn    1   C4   v:83
   004:01:000  Midi  NoteOn    1   C4   v:0
   004:01:000  Midi  NoteOn    1   B3   v:85
   004:01:024  Midi  NoteOn    1   B3   v:0
   004:01:024  Midi  NoteOn    1   G3   v:78
   004:01:048  Midi  NoteOn    1   G3   v:0
   004:01:048  Midi  NoteOn    1   A3   v:85
   004:01:072  Midi  NoteOn    1   A3   v:0
   004:01:072  Midi  NoteOn    1   B3   v:78
   004:02:000  Midi  NoteOn    1   B3   v:0
   004:02:000  Midi  NoteOn    1   C4   v:83
   004:02:024  Midi  NoteOn    1   C4   v:0
   004:02:024  Midi  NoteOn    1   B3   v:83
   004:02:048  Midi  NoteOn    1   B3   v:0
   004:02:048  Midi  NoteOn    1   C4   v:83
   004:02:072  Midi  NoteOn    1   C4   v:0
   004:02:072  Midi  NoteOn    1   D4   v:75
   004:03:000  Midi  NoteOn    1   D4   v:0
   004:03:000  Midi  NoteOn    1   E4   v:88
   004:03:024  Midi  NoteOn    1   E4   v:0
   004:03:024  Midi  NoteOn    1   D4   v:80
   004:03:048  Midi  NoteOn    1   D4   v:0
   004:03:048  Midi  NoteOn    1   E4   v:80
   004:03:072  Midi  NoteOn    1   E4   v:0
   004:03:072  Midi  NoteOn    1   F#4   v:78
   004:04:000  Midi  NoteOn    1   F#4   v:0
   004:04:000  Midi  NoteOn    1   G4   v:90
   004:04:048  Midi  NoteOn    1   G4   v:0
   004:04:048  Midi  NoteOn    1   B3   v:83
   005:01:000  Midi  NoteOn    1   B3   v:0
   005:01:000  Midi  NoteOn    1   C4   v:85
   005:01:048  Midi  NoteOn    1   C4   v:0
   005:01:048  Midi  NoteOn    1   A3   v:83
   005:02:000  Midi  NoteOn    1   A3   v:0
   005:02:000  Midi  NoteOn    1   D4   v:83
   005:02:024  Midi  NoteOn    1   D4   v:0
   005:02:024  Midi  NoteOn    1   C4   v:78
   005:02:048  Midi  NoteOn    1   C4   v:0
   005:02:048  Midi  NoteOn    1   B3   v:83
   005:02:072  Midi  NoteOn    1   B3   v:0
   005:02:072  Midi  NoteOn    1   A3   v:83
   005:03:000  Midi  NoteOn    1   A3   v:0
   005:03:000  Midi  NoteOn    1   G3   v:85
   005:03:072  Midi  NoteOn    1   G3   v:0
   005:03:072  Midi  NoteOn    1   G3   v:78
   005:04:000  Midi  NoteOn    1   G3   v:0
   005:04:000  Midi  NoteOn    1   F3   v:83
   005:04:024  Midi  NoteOn    1   F3   v:0
   005:04:024  Midi  NoteOn    1   E3   v:80
   005:04:048  Midi  NoteOn    1   E3   v:0
   005:04:048  Midi  NoteOn    1   F3   v:78
   005:04:072  Midi  NoteOn    1   F3   v:0
   005:04:072  Midi  NoteOn    1   G3   v:83
   006:01:000  Midi  NoteOn    1   G3   v:0
   006:01:000  Midi  NoteOn    1   A3   v:83
   006:01:024  Midi  NoteOn    1   A3   v:0
   006:01:024  Midi  NoteOn    1   G3   v:80
   006:01:048  Midi  NoteOn    1   G3   v:0
   006:01:048  Midi  NoteOn    1   A3   v:78
   006:01:072  Midi  NoteOn    1   A3   v:0
   006:01:072  Midi  NoteOn    1   B3   v:78
   006:02:000  Midi  NoteOn    1   B3   v:0
   006:02:000  Midi  NoteOn    1   C4   v:88
   006:04:000  Midi  NoteOn    1   C4   v:0
   006:04:000  Midi  NoteOn    1   B3   v:83
   007:01:000  Midi  NoteOn    1   B3   v:0
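Every note line in the two listings above has the same shape: an optional one-letter tag, a Bar:Bt:Tck position, the word 'Midi', the event type, the channel, the note name and a velocity, with 'v:0' ending a note. As a rough illustration of how such a line can be picked apart (a Python sketch, not part of the awk scripts; the function name and the field names are purely illustrative):

```python
import re

# One Midiphile-style NoteOn line, for example:
#    "   002:03:048  Midi  NoteOn    1   G3   v:80"
# An optional single-letter tag (M, A, P, N, F, H) may precede the position.
EVENT_RE = re.compile(
    r"^\s*(?:[A-Z]\s+)?(\d+):(\d+):(\d+)\s+Midi\s+NoteOn\s+(\d+)\s+(\S+)\s+v:(\d+)"
)

def parse_note_event(line):
    """Return a dict for a NoteOn line, or None for any other line."""
    m = EVENT_RE.match(line)
    if m is None:
        return None
    bar, beat, tick, channel, note, velocity = m.groups()
    return {
        "bar": int(bar),
        "beat": int(beat),
        "tick": int(tick),
        "channel": int(channel),
        "note": note,
        "velocity": int(velocity),
        # In these listings a NoteOn with velocity 0 ends the note.
        "is_note_off": int(velocity) == 0,
    }
```

Controller, program change and header lines simply return None, which is one way of 'filtering out unwanted data' before any analysis starts.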

If you don't have access to a RISC OS computer and Midiphile's facilities you may need to manipulate your data into a form similar to the above (perhaps with 'mid2txt_pl' and 'm2t2t_awk', also supplied here) prior to using the melodic and harmonic AWK scripts described below. But first, the next section covers a few utility scripts which you may or may not need to use.


Utility Scripts


mid2txt

Converts a MIDI file into a textual description in either score list (default) or event list format.

Usage: [perl] mid2txt [-e -h] input-MIDI-file

Output is to STDOUT, which can be redirected to file, for example:
mid2txt -e input-MIDI-file > output-text-file
would send an event list to the output file specified. By default a score list is produced (note start time + duration); this can be changed to an event list of note_on and note_off commands with the -e switch.

This script was originally written (2003, version 0.01) as a transportable replacement for the RISC OS-dependent program Midiphile. It requires Perl 5.001 or above and the services of the MIDI-perl modules written by Sean Burke - freely available from CPAN. Once it has converted your MIDI file to text format, the analytical AWK scripts (below) can then be set to work on the data. The textual output from mid2txt is in a form determined by the MIDI-perl modules which, while containing all the information needed, requires further processing by 'm2t2t_awk' to make it conform with the Midiphile list format used by the awk scripts.

However, now (2005, version 0.02) mid2txt also functions as the 'doorway' to a set of perl scripts which manipulate the script's text-list output to achieve a number of further ends.

Here is an excerpt from a mid2txt file:

MIDI file output in plain text:
 MThd (6 bytes)
 Format 1 (2 tracks)
 Metrical time (TickDiv = 96 ppqn)

Track: 0 
@notes = (   # 8 notes...
 ['key_signature', 0, 2, 0],
 ['time_signature', 0, 2, 2, 24, 8],
 ['set_tempo', 0, 631578],
 ['set_tempo', 4656, 638297],
 ['set_tempo', 192, 645161],
 ['set_tempo', 192, 689655],
 ['set_tempo', 192, 674157],
 ['set_tempo', 192, 705882],
);

Track: 1 
@notes = (   # 347 notes...
 ['control_change', 1536, 1, 0, 0],
 ['control_change', 0, 1, 32, 0],
 ['patch_change', 0, 1, 54],
 ['control_change', 0, 1, 10, 0],
 ['control_change', 0, 1, 91, 61],
 ['note_on', 0, 1, 69, 96],
 ['note_on', 48, 1, 69, 0],
 ['note_on', 0, 1, 74, 102],
 ['note_on', 96, 1, 74, 0],
 ['note_on', 0, 1, 73, 90],
 ['note_on', 96, 1, 73, 0],
 ['note_on', 0, 1, 74, 100],
 ['note_on', 48, 1, 74, 0],
 ['note_on', 0, 1, 73, 94],
 ['note_on', 24, 1, 73, 0],
 ['note_on', 0, 1, 71, 100],
 ['note_on', 24, 1, 71, 0],
 ['note_on', 0, 1, 69, 96],
 ['note_on', 48, 1, 69, 0],
 ['note_on', 0, 1, 71, 102],
 ['note_on', 48, 1, 71, 0],
 ['note_on', 0, 1, 69, 102],
 ['note_on', 24, 1, 69, 0],
 ['note_on', 0, 1, 74, 94],
 ['note_on', 24, 1, 74, 0],


m2t2t_awk

This script moulds the text output of mid2txt into the same format used by Midiphile so that the data can be further processed by the harmonic and melodic awk scripts detailed below. Output is to 'm2t2t_txt' in the CWD.

Usage: [g]awk [-v ticksInBar=n -v ticksInBeat=n -v ppqn=n -v readTimeSig=n] -f m2t2t_awk mid2txt_txt

Default 4/4 time, 96 ppqn.

If 'm2t2t_awk' finds time signature and tick division data in the input file it will use this information to override commandline arguments and its own internal defaults. So for the most part no variable values need be given - 'm2t2t_awk' finds the info for itself.

At present the script will only apply a single time signature - the first, at position tick 0 - but does register and print out others found. However, if the variable 'readTimeSig' is set to zero then the script uses the commandline or default time implied by the values of 'ticksInBar' and 'ticksInBeat'. This feature could be used to re-bar / change time signature. Also, the script expects to find only one track with note event data, that is either a Type 0 MIDI file or a Type 1 file with a Track 0 tempo map and Track 1 note events.

The output file 'm2t2t_txt' is essentially the same as the output from Midiphile and can be used with 'mel01_awk' and 'har01_awk'.
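The heart of such a conversion is turning a running total of ticks into a Bar:Bt:Tck position. The Python sketch below is only an illustration of that arithmetic, not a transcription of 'm2t2t_awk'. It assumes that the second item of each mid2txt event is a delta time in ticks since the previous event (my reading of the MIDI-perl event layout), and it uses the stated defaults of 4/4 time at 96 ppqn rather than reading the time signature from the file. The function names are hypothetical.

```python
def to_bar_beat_tick(abs_tick, ticks_in_bar=384, ticks_in_beat=96):
    """Format an absolute tick count as a 1-based Bar:Bt:Tck string."""
    bar, rest = divmod(abs_tick, ticks_in_bar)
    beat, tick = divmod(rest, ticks_in_beat)
    return "%03d:%02d:%03d" % (bar + 1, beat + 1, tick)

def position_events(events, ticks_in_bar=384, ticks_in_beat=96):
    """events: lists shaped like ['note_on', delta, channel, note, velocity].

    Assumption: item 1 of every event is a delta time in ticks.
    Returns (position, event_name, remaining_fields) tuples.
    """
    now = 0
    out = []
    for event in events:
        now += event[1]  # accumulate delta times into an absolute position
        position = to_bar_beat_tick(now, ticks_in_bar, ticks_in_beat)
        out.append((position, event[0], event[2:]))
    return out
```

Fed the first few Track 1 events from the excerpt above, the initial delta of 1536 ticks lands on 005:01:000 under these defaults, and the note_on 48 ticks later on 005:01:048.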


sibfix_awk

Usage: [g]awk -f sibfix_awk inputfile

The RISC OS versions of Sibelius, Sibelius6 and Sibelius7, have a 'feature' that can produce two NoteOns followed by two NoteOffs where notes are exchanged between voices on the same track. Midiphile marks these up with 'N' and 'F' flags ('Excess Note On' and 'Excess Note Off' in the 'Events tagged' legend shown earlier). Where these occur in Midiphile's text output 'sibfix_awk' will correct the text file - it is just the sequence of commands which is at fault. If the fault is not corrected there are likely to be difficulties with using subsequent scripts on the file. Where 'sibfix_awk' has moved a line into the correct order the use of single spaces between fields marks it out. The awk scripts will function on these lines just the same as the untouched ones.


General Usage Tips

Where scripts require option(s) to be set, this can be done from the command line (in the form: -v var=value) or by changing the values at the top of the BEGIN section within the script. The defaults are the values written into the top of the BEGIN section, eg.

# Enter pattern on the command line, for example:
# gawk -v pattern="+2+2+1+2-2-1+5-7" -f mel05_awk mel02_txt
# NOTE: no spaces around the equals sign above...
# OR
# Enter pattern below eg. pattern = "+2+2+1+2-2-1+5-7" (spaces OK).


 BEGIN { if (! pattern) { pattern = "+2+2+1+2-2-1+5-7" }

#------------DON'T ALTER BELOW THIS LINE---------------#


Stage Two - Preparing Melodic Data


mel01_awk

Usage: [g]awk [-v var=value] -f mel01_awk inputfile

This script takes input in the form of Midiphile's text analysis of a MIDI file, filters out unwanted data... and calculates the number of ticks for the start and end of each note event - which is appended to the input data and output as file 'mel01_txt'. So a snatch of the Fugue now looks like:
MIDI data  output by 'mel01_awk'
Text format suitable for 'mel02_awk'.

   002:03:048:768	 Midi NoteOn 1 G3 v:80
   002:04:000:840	 Midi NoteOn 1 G3 v:0
   002:04:000:840	 Midi NoteOn 1 A3 v:80
   002:04:048:888	 Midi NoteOn 1 A3 v:0
   002:04:048:888	 Midi NoteOn 1 B3 v:80
   003:01:000:960	 Midi NoteOn 1 B3 v:0
   003:01:000:960	 Midi NoteOn 1 C4 v:80
   003:01:072:1032	 Midi NoteOn 1 C4 v:0
   003:01:072:1032	 Midi NoteOn 1 D4 v:80
   003:01:084:1044	 Midi NoteOn 1 D4 v:0
   003:01:084:1044	 Midi NoteOn 1 C4 v:80
   003:02:000:1080	 Midi NoteOn 1 C4 v:0
   003:02:000:1080	 Midi NoteOn 1 B3 v:80
   003:02:048:1128	 Midi NoteOn 1 B3 v:0
   003:02:048:1128	 Midi NoteOn 1 E4 v:80
   003:03:000:1200	 Midi NoteOn 1 E4 v:0
   003:03:000:1200	 Midi NoteOn 1 A3 v:80
   003:03:048:1248	 Midi NoteOn 1 A3 v:0
   003:03:048:1248	 Midi NoteOn 1 D4 v:80
   003:04:024:1344	 Midi NoteOn 1 D4 v:0
   003:04:024:1344	 Midi NoteOn 1 E4 v:80
   003:04:048:1368	 Midi NoteOn 1 E4 v:0
   003:04:048:1368	 Midi NoteOn 1 D4 v:80
   003:04:072:1392	 Midi NoteOn 1 D4 v:0
   003:04:072:1392	 Midi NoteOn 1 C4 v:80
   004:01:000:1440	 Midi NoteOn 1 C4 v:0

Time Signature(s) found at:
001:01:000 4/4

The script does allow you to specify how many ticks there are in a bar and a beat, either in the BEGIN section or as command line options. However, if 'mel01_awk' finds time signature and tick division data in the input file it will use this information to override commandline arguments and its own internal defaults. So for the most part no variable values need be given - 'mel01_awk' finds the info for itself. At present the script will only apply a single time signature - the first, at position tick 0 - but does register and print out others found. If the variable 'readTimeSig' is set to zero then the script uses the commandline or default time signature implied by the values of 'ticksInBar' and 'ticksInBeat'. This feature could be used to re-bar / change time signature. The defaults are:

BEGIN {
  if (! ticksInBar) { ticksInBar = 384 }
  if (! ticksInBeat) { ticksInBeat = 96 }
  if (! ppqn) { ppqn = 96 }
  # Test for "unset" rather than "false", so that -v readTimeSig=0 is kept.
  if (readTimeSig == "") { readTimeSig = 1 }
  #Default: 96 ticks/quarternote, 4/4 time sig.
}

(OR on the command line)

gawk -v ticksInBar=384 -v ticksInBeat=96 -v ppqn=96 -f mel01_awk inputfile
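The fourth field that 'mel01_awk' appends is simply the position counted in ticks from the start of the piece. A minimal Python sketch of that arithmetic follows (an illustration only, with a hypothetical function name; the parameter names mirror 'ticksInBar' and 'ticksInBeat'):

```python
def accumulated_ticks(position, ticks_in_bar=384, ticks_in_beat=96):
    """Convert a 'Bar:Bt:Tck' string into ticks since the start of the piece.

    Bars and beats are counted from 1, ticks from 0.
    """
    bar, beat, tick = (int(part) for part in position.split(":"))
    return (bar - 1) * ticks_in_bar + (beat - 1) * ticks_in_beat + tick
```

With the defaults, '003:01:000' gives 768 and '002:04:000' gives 672, which agrees with the 'har01_awk' and 'mel05_awk' listings elsewhere on this page. The accumulated values in the 'mel01_awk' sample listing above (768 for '002:03:048', 1440 for '004:01:000') appear instead to correspond to ticksInBar=480 and ticksInBeat=120, so that sample was presumably produced with different settings.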

mel02_awk

Usage: [g]awk -f mel02_awk mel01_txt

This script takes 'mel01_txt' as input and calculates the intervals between each note in semitones, expressing rising intervals as '+x' and falling intervals as '-x'. It assumes a starting pitch of middle C for the first interval and expresses rests as '-0' and repeated note intervals as '+0'. Two notes separated by a rest (eg. B, rest, C) will have their interval spread across the rest (eg. B, -0, +1). The above snatch (plus a few more notes) of data is converted to:

 
   001:01:000:0	        -0	REST	768
   002:03:048:768	+7	G3	72
   002:04:000:840	+2	A3	48
   002:04:048:888	+2	B3	72
   003:01:000:960	+1	C4	72
   003:01:072:1032	+2	D4	12
   003:01:084:1044	-2	C4	36
   003:02:000:1080	-1	B3	48
   003:02:048:1128	+5	E4	72
   003:03:000:1200	-7	A3	48
   003:03:048:1248	+5	D4	96
   003:04:024:1344	+2	E4	24
   003:04:048:1368	-2	D4	24
   003:04:072:1392	-2	C4	48
   004:01:000:1440	-1	B3	24
   004:01:024:1464	-4	G3	24
   004:01:048:1488	+2	A3	24
   004:01:072:1512	+2	B3	48
   004:02:000:1560	+1	C4	24
   004:02:024:1584	-1	B3	24
   004:02:048:1608	+1	C4	24
   004:02:072:1632	+2	D4	48
   004:03:000:1680	+2	E4	24
   004:03:024:1704	-2	D4	24
   004:03:048:1728	+2	E4	24
   004:03:072:1752	+2	F#4	48
   004:04:000:1800	+1	G4	48
   004:04:048:1848	-8	B3	72
   005:01:000:1920	+1	C4	48
Displaying melodic data as a succession of plus and minus intervals as above provides a convenient format for further analysis. Identifiable themes and motives (eg. +2+2+1+2-2-1+5-7) which remain constant through transpositions and modulations can be searched for, perhaps with the addition of 'wildcards' (regular expressions) to find variations and transformations. This analysis can also take account of durations if required by using the 'tick' information in the fourth column from the left.
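For readers who want to see the interval arithmetic spelled out, here is a small Python sketch (not the 'mel02_awk' script itself; the function names are illustrative). It assumes the note naming seen in the listings on this page, where C3 is MIDI note 60 and therefore middle C, handles sharps only, and ignores rests.

```python
PITCH_CLASS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

def note_number(name):
    """Convert a name such as 'G3' or 'F#4' to a MIDI note number (C3 = 60)."""
    letter, rest = name[0], name[1:]
    accidental = 0
    if rest.startswith("#"):
        accidental, rest = 1, rest[1:]
    return 12 * (int(rest) + 2) + PITCH_CLASS[letter] + accidental

def intervals(names, start="C3"):
    """Signed semitone steps between successive notes, beginning from 'start'."""
    previous = note_number(start)
    out = []
    for name in names:
        current = note_number(name)
        out.append("%+d" % (current - previous))  # "%+d" also yields "+0"
        previous = current
    return out
```

Applied to the opening notes of the fugue subject (G3 A3 B3 C4 D4 C4 B3 E4 A3) this gives +7 +2 +2 +1 +2 -2 -1 +5 -7, the same figures as in the listing above.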

Also the data can be thoroughly sifted to produce a profile of all melodic combinations - which is precisely what the next script does.

mel03_awk

Usage: [g]awk [-v var=value] -f mel03_awk mel02_txt

This script takes as input 'mel02_txt' and catalogues every melodic pattern present in the file... or patterns up to a configurable pattern length.

Cataloguing all combinations can require substantial memory and CPU resources: for example, a melody of 250 notes can produce an output file of 5MB if all combinations are printed out. Therefore the default pattern length is set at 12. To change the maximum length of patterns searched for, set the 'patternlength' variable to the required number... or to the string "no_limit" to search every single pattern ranging from the whole melody down.

By default only patterns occurring two or more times are output (to 'mel03_txt'). Change the variable 'occurrences = 2' to '1' to see all combinations of notes in the output file.

# Change 'occurrences = x' to screen out occurrences below 'x'.
# To limit 'patternlength' give it the desired length or assign
# the string "no_limit" -i.e. all combinations are catalogued,
# which may take some time. Default is 12 notes.

 BEGIN { if (! occurrences) { occurrences = 2 }
         if (! patternlength) { patternlength = 12 }
       }

#-----DO NOT ALTER BELOW THIS LINE-----#

(OR on the command line)

gawk -v occurrences=2 -v patternlength=12 -f mel03_awk mel02_txt

Here are the first few lines of 'mel03_txt' when applied to the first voice of the fugue with the above settings:
   5  of  +2+2+1+2-2-1
   3  of  -1+5-7+5+2-2-2-1
   5  of  +5+2
   5  of  +5+2-2
   3  of  +2-2-1+5-7+5+2-2-2
   9  of  +1+2+2
   2  of  -5+2+2
   5  of  -2-1-2
   5  of  -2+2+2
   3  of  -2-1+5-7+5+2-2-2-1
   3  of  +1+2-2-1+5-7+5+2-2-2-1
   9  of  +2+1+2
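The cataloguing idea can be sketched in a few lines of Python. This is an independent illustration, not the logic of 'mel03_awk': it assumes that pattern length is counted in intervals, that overlapping occurrences are counted separately, and that patterns of fewer than two intervals are not of interest (the 'shortest' parameter is my own addition).

```python
from collections import Counter

def catalogue_patterns(intervals, patternlength=12, occurrences=2, shortest=2):
    """Count every run of consecutive intervals up to 'patternlength' long.

    intervals: list of strings such as ["+2", "+2", "+1"].
    Returns {pattern: count} for patterns seen at least 'occurrences' times.
    """
    n = len(intervals)
    longest = n if patternlength == "no_limit" else min(patternlength, n)
    counts = Counter()
    for length in range(shortest, longest + 1):
        for start in range(n - length + 1):
            counts["".join(intervals[start:start + length])] += 1
    return {p: c for p, c in counts.items() if c >= occurrences}
```

The nested loop visits roughly n times patternlength windows, which is why an unlimited search over a long melody grows so quickly.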

mel04_awk

Usage: [g]awk [-v var=value] -f mel04_awk mel02_txt

This little script takes as input 'mel02_txt' and calculates the number and type of intervals - in semitones. Leaps greater than an octave are reduced to within an octave compass by default. Set -v compressto8ve="no" on the command line or alter the variable in the BEGIN section below to change this behaviour.

 BEGIN { if (! compressto8ve) { compressto8ve = "yes" }
       }
#-------------DO NOT ALTER BELOW THIS LINE------------#

The script also accumulates all plus & minus intervals into a total. Running the script on the first voice of the fugue produces the file 'mel04_txt' with the following output:

File: mel04_txt, interval data.

  Number of +0	semitone intervals: 14
  Number of +1	semitone intervals: 33
  Number of +2	semitone intervals: 59
  Number of +3	semitone intervals: 1
  Number of -1	semitone intervals: 35
  Number of +4	semitone intervals: 1
  Number of -2	semitone intervals: 54
  Number of -3	semitone intervals: 4
  Number of +5	semitone intervals: 15
  Number of -4	semitone intervals: 1
  Number of -5	semitone intervals: 2
  Number of +7	semitone intervals: 2
  Number of -6	semitone intervals: 1
  Number of +8	semitone intervals: 2
  Number of -7	semitone intervals: 7
  Number of -8	semitone intervals: 1

Accumulated plus & minus interval = 31
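A Python sketch of the same bookkeeping is given below, as an illustration rather than a copy of 'mel04_awk'. Two points are assumptions on my part: that a leap of exactly an octave is left alone (only leaps greater than an octave are folded), and that the accumulated total is taken after folding. The function names are hypothetical.

```python
from collections import Counter

def compress(interval):
    """Fold a leap larger than an octave back to within an octave compass."""
    while interval > 12:
        interval -= 12
    while interval < -12:
        interval += 12
    return interval

def interval_profile(intervals, compressto8ve=True):
    """Return (count of each interval, accumulated sum) for signed semitones."""
    if compressto8ve:
        intervals = [compress(i) for i in intervals]
    return Counter(intervals), sum(intervals)
```

As a check on the figures quoted above: the rising intervals add up to 263 semitones and the falling ones to 232, which gives the accumulated value of 31.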

mel05_awk

Usage: [g]awk [-v var=value] -f mel05_awk mel02_txt

Takes 'mel02_txt' as input and searches the file for a given (literal) melodic pattern. This script does NOT support regular expressions (use 'mel06_awk', which does). The output file is 'mel05_txt' in the current working directory.

 
# Enter pattern on the command line, for example:
# gawk -v pattern="+2+2+1+2-2-1+5-7" -f mel05.awk mel02.txt
# NOTE: no spaces around the equals sign above...
# OR
# Enter pattern below eg. pattern = "+2+2+1+2-2-1+5-7" (spaces OK).

 BEGIN { if (! pattern) { pattern = "+2+2+1+2-2-1+5-7" }

#------------DON'T ALTER BELOW THIS LINE---------------#


Running mel05_awk on the mel02_txt which contains data from the first voice of the fugue, returns:

File: mel05.txt (output by script mel05.awk)

Match found, starting at: 002:04:000 +2 A3	48	672
Pattern found: +2+2+1+2-2-1+5-7

Match found, starting at: 007:02:000 +2 D4	48	2400
Pattern found: +2+2+1+2-2-1+5-7

Match found, starting at: 016:03:000 +2 D4	48	5952
Pattern found: +2+2+1+2-2-1+5-7

Match found, starting at: 021:01:000 +2 A3	48	7680
Pattern found: +2+2+1+2-2-1+5-7
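A literal search of this kind amounts to sliding the wanted intervals along the melody one note at a time. The Python sketch below illustrates the idea (it is not 'mel05_awk'; the function name is hypothetical). Comparing whole intervals rather than raw characters is a design choice of the sketch, so that '+1' can never match the front of '+12'.

```python
import re

def find_pattern(intervals, pattern):
    """Return the indexes at which the literal 'pattern' starts.

    intervals: list such as ["+7", "+2", "+2"]; pattern: "+2+2+1+2-2-1+5-7".
    """
    wanted = re.findall(r"[+-]\d+", pattern)  # split into signed intervals
    size = len(wanted)
    if size == 0:
        return []
    return [i for i in range(len(intervals) - size + 1)
            if intervals[i:i + size] == wanted]
```

On the opening of the fugue voice the only hit is at index 1, the A3 at 002:04:000, which is where the first match in the listing above begins.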

mel06_awk

Usage: [g]awk [--re-interval] [-v var=value] -f mel06_awk mel02_txt

Takes 'mel02_txt' as input and searches the file for a range of melodic patterns described by a regular expression, written as a string contained in the variable 'pattern'. The output file is 'mel06_txt' in the current working directory. To use this script some knowledge of regexes is needed; the gawk manual is the place to look for guidance.

# Takes mel02.txt as input.
# Outputs results to file 'mel06.txt' in current working directory.
# The input file is searched for the given melodic pattern like
# 'mel05.awk', however 'pattern' uses regular expressions rather than
# a literal string. This provides a more powerful and refined search
# facility but requires some knowledge of regexs. (Don't forget '\\+'
# and '\\-' for plus & minus in a string but not in the char class.
# or --re-interval for interval expressions)
# Enter pattern on the command line, for example:
# gawk -v pattern="[-+12]{0,12}\\+5\\-7" -f mel06.awk mel02.txt
# NOTE: no spaces around the equals sign above...
# OR... Enter pattern below:
# Eg. pattern = "[-+12]{0,12}\\+5\\-7" (spaces OK).

 BEGIN { if (! pattern) { pattern = "[-+12]{0,12}\\+5\\-7" }

#------------------DON'T ALTER BELOW THIS LINE---------------------#


Running mel06_awk on the mel02_txt which contains data from the first voice of the fugue, returns one extra match:

File: mel06.txt (output by script mel06.awk)

Match found, starting at: 002:04:000 +2 A3	48	672
Pattern found: +2+2+1+2-2-1+5-7

Match found, starting at: 007:02:000 +2 D4	48	2400
Pattern found: +2+2+1+2-2-1+5-7

Match found, starting at: 016:03:000 +2 D4	48	5952
Pattern found: +2+2+1+2-2-1+5-7

Match found, starting at: 017:03:024 +2 A4	24	6360
Pattern found: +2-2-2-1+5-7

Match found, starting at: 021:01:000 +2 A3	48	7680
Pattern found: +2+2+1+2-2-1+5-7
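The extra match arises because the example expression accepts any run of up to twelve of the characters '-', '+', '1' and '2' before the closing '+5-7'. The Python sketch below shows the same kind of search using Python's 're' module in place of gawk's regex engine. It is an illustration only: the function name is hypothetical, and mapping each hit back to the interval that contains its first character is my own choice.

```python
import re
from bisect import bisect_right

def find_regex(intervals, pattern):
    """Search the joined interval string; return (index, matched_text) pairs.

    'index' is the position in 'intervals' of the interval containing the
    first character of the match.
    """
    text = "".join(intervals)
    starts = []  # character offset at which each interval begins
    offset = 0
    for token in intervals:
        starts.append(offset)
        offset += len(token)
    hits = []
    for m in re.finditer(pattern, text):
        hits.append((bisect_right(starts, m.start()) - 1, m.group(0)))
    return hits
```

Usage with the expression from the script comments, written as a Python raw string so the backslashes need no doubling: find_regex(voice, r"[-+12]{0,12}\+5\-7").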



Stage Two - Preparing Harmonic Data

Scripts dealing with harmonic data are named 'harxx_awk' (where xx is a number) and their output will appear in the Current Working Directory with the name 'harxx_txt'. A similar scheme is used for melodic data - e.g. 'mel01_awk'.

har01_awk / har01p_awk

Usage: [g]awk [-v var=value] -f har01[p]_awk inputfile

This script processes MIDI data taken from RISC OS Midiphile's text output - to prepare it for harmonic analysis by the script 'har02_awk'. Input takes the form:

 002:01:000  Midi  NoteOn    2   C3   v:80

and the music should have all note data held within one MIDI track. Output is to the file 'har01_txt' or 'har01p_txt' in the CWD.

It may help to look at the way this script sifts its input data - which is a (text) list of note-on and note-off events, in time order, presented by Midiphile's output. 'har01_awk' reads these events and constructs a running list of 'notes currently sounding'. At a regular interval set by the 'pollincrement' variable this running list (of MIDI note numbers) is written out to file. The pollincrement is set in MIDI ticks - eg. at 96 pulses per quarter note, a pollincrement of 48 would mean polling every quaver and 192 at minim intervals.

p is for Pedal!

By using the 'har01p_awk' variant of the script, the running list of 'notes currently sounding' gains the facility of a piano's sustaining pedal for the length of the pollincrement. This is used for 'scooping up' arpeggi, broken or spread chords. For example, Bach's Prelude No.1 requires a pollincrement of 192 used with the 'har01p_awk' script to yield useful information. To some extent the choice of script variant and the value of the pollincrement variable are interdependent and each should be set with an eye to the other... and most importantly, with an eye to the texture of the score.
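To make the polling idea concrete, here is a Python sketch covering both the plain and the 'pedal' behaviour. It is my own reading of the description above, not the code of 'har01_awk' or 'har01p_awk'. The assumptions: events arrive in time order as (tick, note number, velocity) with velocity 0 ending a note; each window runs from one poll up to, but not including, the next; the result is labelled with the tick at which the window ends; and releases that fall exactly on a window boundary are credited to the window just finished. The function name and the 'pedal' flag are hypothetical.

```python
def poll_sounding(events, pollincrement, pedal=False):
    """Report the notes sounding in each window of 'pollincrement' ticks.

    events: (abs_tick, note_number, velocity) tuples in time order.
    Returns a list of (poll_tick, sorted_note_numbers).
    """
    if pollincrement <= 0:
        raise ValueError("pollincrement must be positive")
    results = []
    sounding = set()  # notes currently held down
    i = 0
    start = 0
    while i < len(events):
        end = start + pollincrement
        held = None  # pedal snapshot, taken lazily (see below)
        while i < len(events) and events[i][0] < end:
            tick, note, velocity = events[i]
            # Releases exactly on the window start belong to the previous
            # window, so take the snapshot only after they are applied.
            if held is None and (tick > start or velocity > 0):
                held = set(sounding)
            if velocity > 0:
                sounding.add(note)
                held.add(note)  # the pedal keeps every note struck
            else:
                sounding.discard(note)
            i += 1
        if held is None:
            held = set(sounding)
        results.append((end, sorted(held if pedal else sounding)))
        start = end
    return results
```

On the first half bar of the Prelude the plain version reports only the three notes still held at the poll (60 64 76), whereas the pedal version gathers the whole broken chord (60 64 67 72 76).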

Variables which can be set at the top of the BEGIN section or from the command line are:


When har01p_awk is run on Midiphile's text output for Prelude No.1 (quoted near the beginning of this document) with the command:

[g]awk -v pollincrement=192 -f har01p_awk prelude1

the result is a list of time related MIDI note numbers in the form:

Bar:Beat:Ticks:AccumulatedTicks (Tab) List-of-Note-Numbers

For example:
002:03:000:576	 60 64 67 72 76
003:01:000:768	 60 64 67 72 76
003:03:000:960	 60 62 69 74 77
004:01:000:1152	 60 62 69 74 77
004:03:000:1344	 59 62 67 74 77
005:01:000:1536	 59 62 67 74 77
005:03:000:1728	 60 64 67 72 76
006:01:000:1920	 60 64 67 72 76
The above data represents an intermediate stage requiring further processing by 'har02_awk'. A batch file could be used to automate the working of 'har01_awk' and 'har02_awk'.
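
The Bar:Beat:Ticks:AccumulatedTicks labels in the listing above are related by simple arithmetic, sketched here in Python. The sketch assumes 96 ticks per quarter note and four beats per bar, which is consistent with the sample lines but is not stated as a fixed property of the scripts; the function name `position_label` and its parameters are hypothetical.

```python
def position_label(accumulated, ppqn=96, beats_per_bar=4):
    """Format accumulated ticks as Bar:Beat:Ticks:AccumulatedTicks."""
    ticks_per_bar = ppqn * beats_per_bar
    bar = accumulated // ticks_per_bar + 1       # bars are counted from 1
    within_bar = accumulated % ticks_per_bar
    beat = within_bar // ppqn + 1                # beats are counted from 1
    tick = within_bar % ppqn
    return f"{bar:03d}:{beat:02d}:{tick:03d}:{accumulated}"


for total in (576, 768, 960):
    print(position_label(total))
```

For instance, 576 ticks is one full bar (384 ticks) plus two beats, so it lands on the third beat of bar two.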

har02_awk

The data in 'har01_txt' above, reading the MIDI note numbers from left to right, is just another way of writing the chords Cmaj, Dmin (or Fmaj), Gmaj, Cmaj, or in other terms I, II(iii)/IV(ii), V(i), I. Naming the chords in this way is precisely what the script 'har02_awk' attempts to do.

However, the ambiguity in the second bar is only the tip of the iceberg of uncertainty in regard to harmonic analysis. (Is it D minor over a pedal C or F major second inversion with an added sixth?) The script works by seeking out every possible harmonic interpretation of each line of note numbers. Some lines, like the first two, are clear cut (C major) and some, like the second two, have more than one interpretation... Which is the right interpretation I leave up to the user. :-) Or even, is there a right interpretation?
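
To make 'every possible harmonic interpretation' concrete, here is a much reduced Python sketch. It is not the method used by 'har02_awk', which is not shown on this page: it simply tries all twelve roots against major and minor triads only, and reports each triad wholly contained in the line of note numbers. The function name `interpretations`, the returned tuple layout and the restriction to two chord types are assumptions made for illustration.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
TRIADS = {"Major": (0, 4, 7), "minor": (0, 3, 7)}   # semitones above the root


def interpretations(notes):
    """List every major/minor triad contained in a set of MIDI note numbers.

    Each result is (root name, quality, bass interval, extra intervals),
    where intervals are semitones above the root, 0 to 11.
    """
    pitch_classes = {n % 12 for n in notes}
    bass = min(notes) % 12
    found = []
    for root in range(12):
        for quality, shape in TRIADS.items():
            chord = {(root + step) % 12 for step in shape}
            if chord <= pitch_classes:            # all chord tones present
                extras = sorted((pc - root) % 12 for pc in pitch_classes - chord)
                found.append((NOTE_NAMES[root], quality, (bass - root) % 12, extras))
    return found


print(interpretations([60, 64, 67, 72, 76]))   # one reading only
print(interpretations([60, 62, 69, 74, 77]))   # two readings
```

For the second line the sketch finds both D minor (with an extra note 10 semitones above the root, a minor seventh, which is also the bass) and F major (with an extra note 9 semitones above the root, a major sixth, and the fifth in the bass), matching the ambiguity discussed above.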

Running 'har02_awk' over the output file 'har01_txt' produces the following:

002:03:000:576	 60 64 67 72 76	 :C Major
003:01:000:768	 60 64 67 72 76	 :C Major
003:03:000:960	 60 62 69 74 77	 :F Major(ii) +maj6 :D minor(iii) +min7
004:01:000:1152	 60 62 69 74 77	 :F Major(ii) +maj6 :D minor(iii) +min7
004:03:000:1344	 59 62 67 74 77	 :G Major(i) +min7
005:01:000:1536	 59 62 67 74 77	 :G Major(i) +min7
005:03:000:1728	 60 64 67 72 76	 :C Major
006:01:000:1920	 60 64 67 72 76	 :C Major
006:03:000:2112	 60 64 69 76 81	 :A minor(i)
007:01:000:2304	 60 64 69 76 81	 :A minor(i)
007:03:000:2496	 60 62 66 69 74	 :D Major(iii) +min7
008:01:000:2688	 60 62 66 69 74	 :D Major(iii) +min7
008:03:000:2880	 59 62 67 74 79	 :G Major(i)
009:01:000:3072	 59 62 67 74 79	 :G Major(i)
009:03:000:3264	 59 60 64 67 72	 :E minor(ii) +min6 :C Major(iii) +maj7
010:01:000:3456	 59 60 64 67 72	 :E minor(ii) +min6 :C Major(iii) +maj7
010:03:000:3648	 57 60 64 67 72	 :A minor +min7 :C Major +maj6
011:01:000:3840	 57 60 64 67 72	 :A minor +min7 :C Major +maj6
011:03:000:4032	 50 57 62 66 72	 :D Major +min7
012:01:000:4224	 50 57 62 66 72	 :D Major +min7
012:03:000:4416	 55 59 62 67 71	 :G Major
013:01:000:4608	 55 59 62 67 71	 :G Major
013:03:000:4800	 55 58 64 67 73	 :G dim
014:01:000:4992	 55 58 64 67 73	 :G dim
014:03:000:5184	 53 57 62 69 74	 :D minor(i)
015:01:000:5376	 53 57 62 69 74	 :D minor(i)
015:03:000:5568	 53 56 62 65 71	 :F dim
016:01:000:5760	 53 56 62 65 71	 :F dim
016:03:000:5952	 52 55 60 67 72	 :C Major(i)
017:01:000:6144	 52 55 60 67 72	 :C Major(i)

Having arrived at this stage with all (or most) of the harmonic progressions of the piece visible, it is now possible to search and sift the data for chord patterns or sequences.


har03_awk

A script to search for chord patterns or sequences and modulations of such patterns. Which I have yet to write :-(


har04_awk

A script which can determine and trace the tonal centre (keys/modulations) of a piece - that is, rather than list every possible chord as in 'har02_awk', a script that will express an opinion as to the sequence of harmonies that forms the underlying structure. Which I have yet to write :-(


har05_awk

This script takes input from 'har01_txt' files, which it searches for notes that could be used to write a canon fitting the harmonic sequence represented by the input file.

Usage: [g]awk [-v pitch=n -v time=u] -f har05_awk har01_txt

Here is a snatch of the output for the Prelude:
Output from har05_awk
Pitch interval between voices is 7 semitones
Time lapse between voices is 2 units

001:01:000:0	CEG
001:03:000:192	CEG
002:01:000:384	DCA
002:03:000:576	DCA
003:01:000:768	DBG
003:03:000:960	DBG
004:01:000:1152	EG
004:03:000:1344	EG
005:01:000:1536	EA
005:03:000:1728	EA
006:01:000:1920	ADF#
006:03:000:2112	ADF#
007:01:000:2304	DGB
007:03:000:2496	DGB
008:01:000:2688	GBE
008:03:000:2880	GBE
The notes listed in 'har05_txt' above are those which can be used in the first voice at the Bar:Beat:Tick position shown; when transposed by 'pitch' and delayed by 'time' they form the second voice and still fit the harmonies at their new position. Output is appended to 'har05_txt' so that a number of different pitches and times can be tried out and compared to find the best one.
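
The test applied at each position can be sketched in Python as follows. This is a guess at the idea, not the code of 'har05_awk': a chord note is kept if, moved by 'pitch' semitones and 'time' polling units, it is also a member of the chord found there. The sketch assumes the second voice lies below the first, a direction chosen only because it agrees with the opening lines of the sample listing; the function name `canon_candidates` and the use of sharps for note names are likewise assumptions.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def canon_candidates(chords, pitch=7, time=2):
    """For each time unit, name the chord notes usable in the first voice.

    chords: one list of MIDI note numbers per polling unit.
    A note qualifies if, lowered by `pitch` semitones (assumed direction),
    it belongs to the chord `time` units later.
    """
    result = []
    for i in range(len(chords) - time):
        here = {n % 12 for n in chords[i]}
        later = {n % 12 for n in chords[i + time]}
        fits = sorted(pc for pc in here if (pc - pitch) % 12 in later)
        result.append([NOTE_NAMES[pc] for pc in fits])
    return result


# First harmonies of the Prelude, two polling units each (from the listing
# of note numbers earlier on this page).
chords = (
    [[60, 64, 67, 72, 76]] * 2
    + [[60, 62, 69, 74, 77]] * 2
    + [[59, 62, 67, 74, 77]] * 2
    + [[60, 64, 67, 72, 76]] * 2
)
for names in canon_candidates(chords):
    print("".join(names))
```

The last 'time' units produce no line because there is no later chord to test against.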

And here is a canon written with the help of this script on the Prelude No.1 harmonies: Canon in MIDI file format. This (feeble) attempt at a canon is included for illustrative purposes only!



E-Messages

My.electronic@mail.com.address.is.embedded.somewherepjperry@freeuk.comhere.and.is.not.dissimilar@to.the.www.domain
Apologies for this inconvenience.


Click here to download the above music analysis scripts and a copy of this page: awk scripts in a ZIP archive (20Kb).

Click here to download the awk HTML sorting script, entity mapping file and brief instructions.



Created 28Jan2003
Updated 10Dec2007