The approach being followed at present is to use music scores in MIDI file format and then convert these to a textual form. This produces a text file which is little different from an event list found in most MIDI sequencers. It is best to generate the original MIDI file from a score-writing application so that the rhythmic data is exact - a human performer is unlikely to play with the mechanical precision required.
My 'home' operating system, the delightful RISC OS, has a well
featured freeware MIDI analysing program - Midiphile. I
have used this program to generate the base textual description of the
music (MIDI file) upon which the following awk scripts operate. As an
alternative, for users of other platforms, I have written the Perl
script mid2txt which will perform the same function. So wherever
Midiphile is mentioned below, read mid2txt as required.
AWK scripts are used in a multi-stage process on the text output of Midiphile. Stage one is to prepare the data. That is, convert it to a convenient form for either melodic or harmonic analysis. Stage two then examines the prepared data with further awk scripts to extract the particular information required.
Scripts dealing with melodic data are named 'melxx_awk' (where xx is a number) and their output appears in the current working directory (CWD) under the name 'melxx_txt'. A similar scheme is used for harmonic data, i.e. 'har01_awk'. Output goes to a 'bare' filename in the CWD because at least one (Windows) port of gawk I use has difficulty with full path names. Also, my own 'home' OS uses the dot (period) as a directory separator and a forward slash as a stand-in for the dot that separates filename from filetype, and it does not store the file type in the filename. The forms '_awk' and '_txt' therefore save the constant labour of switching back and forth between dot and underscore when moving between different OSs. If required, these features can be changed by a few global search and replace operations on the scripts downloaded to your own system.
For illustrative purposes the first Prelude and Fugue from the 'Well-tempered Clavier' by J.S.Bach are used in this guide.
From the above score two MIDI files are exported, making sure that all of the MIDI data for the Prelude is contained within one MIDI track (for harmonic analysis) and that the four voices of the Fugue each have their own MIDI track (for melodic analysis). These two MIDI files are next converted to a descriptive text format by the 'Midiphile' program, illustrated below.
Midiphile : 0.21 (13 Mar 2001)
MIDI file : ADFS::HardDisc4.$.Music.Sib7_Files.BachJS.P1_48/MID
Decoded : Sat 28 Dec 2002. 11:16 PM
Tracks : 0 and 1 (all data)
Events tagged : M : Meta event
                A : ANO (Ctrlr 123)
                P : Program Change
                N : Excess Note On
                F : Excess Note Off
                H : Hanging note
------------------------------------------------------------
MThd (6 bytes)
Format 1 (2 simultaneous tracks)
Metrical time (TickDiv = 96 ppqn)

MTrk (18 bytes)
TRACK 0, Tempo track
Bar:Bt:Tck Event Type
M 001:01:000 Meta KeySig C Maj
M 001:01:000 Meta TimeSig 4/4 (cc=24 bb=8)
M 001:01:000 Meta EOT
-> Found 3 Meta events

MTrk (4432 bytes)
TRACK 1
Bar:Bt:Tck Event Type
001:01:000 Midi Ctrllr 1 0 (Bank Select; MSB) 0
001:01:000 Midi Ctrllr 1 32 (Bank Select; LSB) 0
P 001:01:000 Midi ProgChg 1 0
001:01:000 Midi Ctrllr 1 10 (Pan; MSB) 64
001:01:000 Midi Ctrllr 1 91 (Effects 1) 61
001:01:000 Midi NoteOn 1 C3 v:78
001:01:024 Midi NoteOn 1 E3 v:78
001:01:048 Midi NoteOn 1 G3 v:76
001:01:072 Midi NoteOn 1 G3 v:0
001:01:072 Midi NoteOn 1 C4 v:90
001:02:000 Midi NoteOn 1 C4 v:0
001:02:000 Midi NoteOn 1 E4 v:95
001:02:024 Midi NoteOn 1 E4 v:0
001:02:024 Midi NoteOn 1 G3 v:73
001:02:048 Midi NoteOn 1 G3 v:0
001:02:048 Midi NoteOn 1 C4 v:80
001:02:072 Midi NoteOn 1 C4 v:0
001:02:072 Midi NoteOn 1 E4 v:97
001:03:000 Midi NoteOn 1 E4 v:0
001:03:000 Midi NoteOn 1 E3 v:0
001:03:000 Midi NoteOn 1 C3 v:0
001:03:000 Midi NoteOn 1 C3 v:71
001:03:024 Midi NoteOn 1 E3 v:78
001:03:048 Midi NoteOn 1 G3 v:76
001:03:072 Midi NoteOn 1 G3 v:0
001:03:072 Midi NoteOn 1 C4 v:85
001:04:000 Midi NoteOn 1 C4 v:0
001:04:000 Midi NoteOn 1 E4 v:105
001:04:024 Midi NoteOn 1 E4 v:0
001:04:024 Midi NoteOn 1 G3 v:75
001:04:048 Midi NoteOn 1 G3 v:0
001:04:048 Midi NoteOn 1 C4 v:83
001:04:072 Midi NoteOn 1 C4 v:0
001:04:072 Midi NoteOn 1 E4 v:97
002:01:000 Midi NoteOn 1 E4 v:0
002:01:000 Midi NoteOn 1 E3 v:0
002:01:000 Midi NoteOn 1 C3 v:0
002:01:000 Midi NoteOn 1 C3 v:80
002:01:024 Midi NoteOn 1 D3 v:74
002:01:048 Midi NoteOn 1 A3 v:78
002:01:072 Midi NoteOn 1 A3 v:0
002:01:072 Midi NoteOn 1 D4 v:90
002:02:000 Midi NoteOn 1 D4 v:0
002:02:000 Midi NoteOn 1 F4 v:99
002:02:024 Midi NoteOn 1 F4 v:0
002:02:024 Midi NoteOn 1 A3 v:73
002:02:048 Midi NoteOn 1 A3 v:0
002:02:048 Midi NoteOn 1 D4 v:83
002:02:072 Midi NoteOn 1 D4 v:0
002:02:072 Midi NoteOn 1 F4 v:92
002:03:000 Midi NoteOn 1 F4 v:0
002:03:000 Midi NoteOn 1 D3 v:0
002:03:000 Midi NoteOn 1 C3 v:0
002:03:000 Midi NoteOn 1 C3 v:78
002:03:024 Midi NoteOn 1 D3 v:78
002:03:048 Midi NoteOn 1 A3 v:74
002:03:072 Midi NoteOn 1 A3 v:0
002:03:072 Midi NoteOn 1 D4 v:90
002:04:000 Midi NoteOn 1 D4 v:0
002:04:000 Midi NoteOn 1 F4 v:93
002:04:024 Midi NoteOn 1 F4 v:0
002:04:024 Midi NoteOn 1 A3 v:78
002:04:048 Midi NoteOn 1 A3 v:0
002:04:048 Midi NoteOn 1 D4 v:80
002:04:072 Midi NoteOn 1 D4 v:0
002:04:072 Midi NoteOn 1 F4 v:95
003:01:000 Midi NoteOn 1 F4 v:0
003:01:000 Midi NoteOn 1 D3 v:0
003:01:000 Midi NoteOn 1 C3 v:0
003:01:000 Midi NoteOn 1 B2 v:79
003:01:024 Midi NoteOn 1 D3 v:80
003:01:048 Midi NoteOn 1 G3 v:78
003:01:072 Midi NoteOn 1 G3 v:0
003:01:072 Midi NoteOn 1 D4 v:93
003:02:000 Midi NoteOn 1 D4 v:0
003:02:000 Midi NoteOn 1 F4 v:101
003:02:024 Midi NoteOn 1 F4 v:0
003:02:024 Midi NoteOn 1 G3 v:72
003:02:048 Midi NoteOn 1 G3 v:0
003:02:048 Midi NoteOn 1 D4 v:88
003:02:072 Midi NoteOn 1 D4 v:0
003:02:072 Midi NoteOn 1 F4 v:92
003:03:000 Midi NoteOn 1 F4 v:0
003:03:000 Midi NoteOn 1 D3 v:0
003:03:000 Midi NoteOn 1 B2 v:0
003:03:000 Midi NoteOn 1 B2 v:74
003:03:024 Midi NoteOn 1 D3 v:76
003:03:048 Midi NoteOn 1 G3 v:74
003:03:072 Midi NoteOn 1 G3 v:0
003:03:072 Midi NoteOn 1 D4 v:84
003:04:000 Midi NoteOn 1 D4 v:0
003:04:000 Midi NoteOn 1 F4 v:101
003:04:024 Midi NoteOn 1 F4 v:0
003:04:024 Midi NoteOn 1 G3 v:77
003:04:048 Midi NoteOn 1 G3 v:0
003:04:048 Midi NoteOn 1 D4 v:86
003:04:072 Midi NoteOn 1 D4 v:0
003:04:072 Midi NoteOn 1 F4 v:88
004:01:000 Midi NoteOn 1 F4 v:0
004:01:000 Midi NoteOn 1 D3 v:0
004:01:000 Midi NoteOn 1 B2 v:0
004:01:000 Midi NoteOn 1 C3 v:83
004:01:024 Midi NoteOn 1 E3 v:80
004:01:048 Midi NoteOn 1 G3 v:80
004:01:072 Midi NoteOn 1 G3 v:0
004:01:072 Midi NoteOn 1 C4 v:85
004:02:000 Midi NoteOn 1 C4 v:0
004:02:000 Midi NoteOn 1 E4 v:99
004:02:024 Midi NoteOn 1 E4 v:0
004:02:024 Midi NoteOn 1 G3 v:78
004:02:048 Midi NoteOn 1 G3 v:0
004:02:048 Midi NoteOn 1 C4 v:80
004:02:072 Midi NoteOn 1 C4 v:0
004:02:072 Midi NoteOn 1 E4 v:97
004:03:000 Midi NoteOn 1 E4 v:0
004:03:000 Midi NoteOn 1 E3 v:0
004:03:000 Midi NoteOn 1 C3 v:0
004:03:000 Midi NoteOn 1 C3 v:85
004:03:024 Midi NoteOn 1 E3 v:78
004:03:048 Midi NoteOn 1 G3 v:76
004:03:072 Midi NoteOn 1 G3 v:0
004:03:072 Midi NoteOn 1 C4 v:87
004:04:000 Midi NoteOn 1 C4 v:0
004:04:000 Midi NoteOn 1 E4 v:101
004:04:024 Midi NoteOn 1 E4 v:0
004:04:024 Midi NoteOn 1 G3 v:78
004:04:048 Midi NoteOn 1 G3 v:0
004:04:048 Midi NoteOn 1 C4 v:83
004:04:072 Midi NoteOn 1 C4 v:0
004:04:072 Midi NoteOn 1 E4 v:97
005:01:000 Midi NoteOn 1 E4 v:0
005:01:000 Midi NoteOn 1 E3 v:0
005:01:000 Midi NoteOn 1 C3 v:0
[Without file header]
MTrk (2038 bytes)
TRACK 1
Bar:Bt:Tck Event Type
001:01:000 Midi Ctrllr 1 7 (Volume; MSB) 80
001:01:000 Midi Ctrllr 1 1 (Modulation; MSB) 8
001:01:000 Midi Ctrllr 1 91 (Effects 1) 48
001:01:000 Midi Ctrllr 1 93 (Effects 3) 0
001:01:000 Midi Ctrllr 1 11 (Expression; MSB) 64
001:03:048 Midi Ctrllr 1 0 (Bank Select; MSB) 0
001:03:048 Midi Ctrllr 1 32 (Bank Select; LSB) 0
P 001:03:048 Midi ProgChg 1 74
001:03:048 Midi Ctrllr 1 10 (Pan; MSB) 56
001:03:048 Midi Ctrllr 1 91 (Effects 1) 61
002:03:048 Midi NoteOn 1 G3 v:80
002:04:000 Midi NoteOn 1 G3 v:0
002:04:000 Midi NoteOn 1 A3 v:85
002:04:048 Midi NoteOn 1 A3 v:0
002:04:048 Midi NoteOn 1 B3 v:83
003:01:000 Midi NoteOn 1 B3 v:0
003:01:000 Midi NoteOn 1 C4 v:83
003:01:072 Midi NoteOn 1 C4 v:0
003:01:072 Midi NoteOn 1 D4 v:85
003:01:084 Midi NoteOn 1 D4 v:0
003:01:084 Midi NoteOn 1 C4 v:85
003:02:000 Midi NoteOn 1 C4 v:0
003:02:000 Midi NoteOn 1 B3 v:83
003:02:048 Midi NoteOn 1 B3 v:0
003:02:048 Midi NoteOn 1 E4 v:78
003:03:000 Midi NoteOn 1 E4 v:0
003:03:000 Midi NoteOn 1 A3 v:88
003:03:048 Midi NoteOn 1 A3 v:0
003:03:048 Midi NoteOn 1 D4 v:75
003:04:024 Midi NoteOn 1 D4 v:0
003:04:024 Midi NoteOn 1 E4 v:83
003:04:048 Midi NoteOn 1 E4 v:0
003:04:048 Midi NoteOn 1 D4 v:85
003:04:072 Midi NoteOn 1 D4 v:0
003:04:072 Midi NoteOn 1 C4 v:83
004:01:000 Midi NoteOn 1 C4 v:0
004:01:000 Midi NoteOn 1 B3 v:85
004:01:024 Midi NoteOn 1 B3 v:0
004:01:024 Midi NoteOn 1 G3 v:78
004:01:048 Midi NoteOn 1 G3 v:0
004:01:048 Midi NoteOn 1 A3 v:85
004:01:072 Midi NoteOn 1 A3 v:0
004:01:072 Midi NoteOn 1 B3 v:78
004:02:000 Midi NoteOn 1 B3 v:0
004:02:000 Midi NoteOn 1 C4 v:83
004:02:024 Midi NoteOn 1 C4 v:0
004:02:024 Midi NoteOn 1 B3 v:83
004:02:048 Midi NoteOn 1 B3 v:0
004:02:048 Midi NoteOn 1 C4 v:83
004:02:072 Midi NoteOn 1 C4 v:0
004:02:072 Midi NoteOn 1 D4 v:75
004:03:000 Midi NoteOn 1 D4 v:0
004:03:000 Midi NoteOn 1 E4 v:88
004:03:024 Midi NoteOn 1 E4 v:0
004:03:024 Midi NoteOn 1 D4 v:80
004:03:048 Midi NoteOn 1 D4 v:0
004:03:048 Midi NoteOn 1 E4 v:80
004:03:072 Midi NoteOn 1 E4 v:0
004:03:072 Midi NoteOn 1 F#4 v:78
004:04:000 Midi NoteOn 1 F#4 v:0
004:04:000 Midi NoteOn 1 G4 v:90
004:04:048 Midi NoteOn 1 G4 v:0
004:04:048 Midi NoteOn 1 B3 v:83
005:01:000 Midi NoteOn 1 B3 v:0
005:01:000 Midi NoteOn 1 C4 v:85
005:01:048 Midi NoteOn 1 C4 v:0
005:01:048 Midi NoteOn 1 A3 v:83
005:02:000 Midi NoteOn 1 A3 v:0
005:02:000 Midi NoteOn 1 D4 v:83
005:02:024 Midi NoteOn 1 D4 v:0
005:02:024 Midi NoteOn 1 C4 v:78
005:02:048 Midi NoteOn 1 C4 v:0
005:02:048 Midi NoteOn 1 B3 v:83
005:02:072 Midi NoteOn 1 B3 v:0
005:02:072 Midi NoteOn 1 A3 v:83
005:03:000 Midi NoteOn 1 A3 v:0
005:03:000 Midi NoteOn 1 G3 v:85
005:03:072 Midi NoteOn 1 G3 v:0
005:03:072 Midi NoteOn 1 G3 v:78
005:04:000 Midi NoteOn 1 G3 v:0
005:04:000 Midi NoteOn 1 F3 v:83
005:04:024 Midi NoteOn 1 F3 v:0
005:04:024 Midi NoteOn 1 E3 v:80
005:04:048 Midi NoteOn 1 E3 v:0
005:04:048 Midi NoteOn 1 F3 v:78
005:04:072 Midi NoteOn 1 F3 v:0
005:04:072 Midi NoteOn 1 G3 v:83
006:01:000 Midi NoteOn 1 G3 v:0
006:01:000 Midi NoteOn 1 A3 v:83
006:01:024 Midi NoteOn 1 A3 v:0
006:01:024 Midi NoteOn 1 G3 v:80
006:01:048 Midi NoteOn 1 G3 v:0
006:01:048 Midi NoteOn 1 A3 v:78
006:01:072 Midi NoteOn 1 A3 v:0
006:01:072 Midi NoteOn 1 B3 v:78
006:02:000 Midi NoteOn 1 B3 v:0
006:02:000 Midi NoteOn 1 C4 v:88
006:04:000 Midi NoteOn 1 C4 v:0
006:04:000 Midi NoteOn 1 B3 v:83
007:01:000 Midi NoteOn 1 B3 v:0
If you don't have access to a RISC OS computer and Midiphile's facilities you may need to manipulate your data into a form similar to the above (perhaps with 'mid2txt_pl' and 'm2t2t_awk', also supplied here) prior to using the melodic and harmonic AWK scripts described below. But first, the next section covers a few utility scripts which you may or may not need to use.
'mid2txt' converts a MIDI file into a textual description in either score list (default) or event list format.
Usage: [perl] mid2txt [-e -h] input-MIDI-file
Output is to STDOUT, which can be redirected to file, for example:
mid2txt -e input-MIDI-file > output-text-file
would send an event list to the output file specified. By default a score list is produced (note start time + duration); the -e switch changes this to an event list of note_on and note_off commands.
This script was originally written (2003, version 0.01) as a transportable replacement for the RISC OS-dependent program Midiphile. It requires Perl 5.001 or above and the services of the MIDI-perl modules written by Sean Burke, freely available from CPAN. Once it has converted your MIDI file to text format, the analytical AWK scripts (below) can be set to work on the data. The textual output from mid2txt is in a form determined by the MIDI-perl modules which, while containing all the information needed, requires further processing by 'm2t2t_awk' to make it conform to the Midiphile list format used by the awk scripts.
However, now (2005, version 0.02) mid2txt also functions as the 'doorway' to a set of perl scripts which manipulate the script's text-list output to achieve a number of further ends.
Here is an excerpt from a mid2txt file:
MIDI file output in plain text:
MThd (6 bytes)
Format 1 (2 tracks)
Metrical time (TickDiv = 96 ppqn)

Track: 0
@notes = ( # 8 notes...
['key_signature', 0, 2, 0],
['time_signature', 0, 2, 2, 24, 8],
['set_tempo', 0, 631578],
['set_tempo', 4656, 638297],
['set_tempo', 192, 645161],
['set_tempo', 192, 689655],
['set_tempo', 192, 674157],
['set_tempo', 192, 705882],
);

Track: 1
@notes = ( # 347 notes...
['control_change', 1536, 1, 0, 0],
['control_change', 0, 1, 32, 0],
['patch_change', 0, 1, 54],
['control_change', 0, 1, 10, 0],
['control_change', 0, 1, 91, 61],
['note_on', 0, 1, 69, 96],
['note_on', 48, 1, 69, 0],
['note_on', 0, 1, 74, 102],
['note_on', 96, 1, 74, 0],
['note_on', 0, 1, 73, 90],
['note_on', 96, 1, 73, 0],
['note_on', 0, 1, 74, 100],
['note_on', 48, 1, 74, 0],
['note_on', 0, 1, 73, 94],
['note_on', 24, 1, 73, 0],
['note_on', 0, 1, 71, 100],
['note_on', 24, 1, 71, 0],
['note_on', 0, 1, 69, 96],
['note_on', 48, 1, 69, 0],
['note_on', 0, 1, 71, 102],
['note_on', 48, 1, 71, 0],
['note_on', 0, 1, 69, 102],
['note_on', 24, 1, 69, 0],
['note_on', 0, 1, 74, 94],
['note_on', 24, 1, 74, 0],

'm2t2t_awk' moulds the text output of mid2txt into the same format used by Midiphile so that the data can be further processed by the harmonic and melodic awk scripts detailed below. Output is to 'm2t2t_txt' in the CWD.
Usage: [g]awk [-v ticksInBar=n -v ticksInBeat=n -v ppqn=n -v readTimeSig=n] -f m2t2t_awk mid2txt_txt
Default 4/4 time, 96 ppqn.
If 'm2t2t_awk' finds time signature and tick division data in the input file it will use this information to override command line arguments and its own internal defaults. So for the most part no variable values need be given: 'm2t2t_awk' finds the info for itself.
At present the script will only apply a single time signature (the first, at position tick 0) but does register and print out any others found. However, if the variable 'readTimeSig' is set to zero then the script uses the command line or default time signature implied by the values of 'ticksInBar' and 'ticksInBeat'. This feature could be used to re-bar or change the time signature. Also, the script expects to find only one track with note event data, that is either a Type 0 MIDI file or a Type 1 file with a Track 0 tempo map and Track 1 note events.
The output file 'm2t2t_txt' is essentially the same as the output from Midiphile and can be used with 'mel01_awk' and 'har01_awk'.
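As a rough illustration of the kind of conversion involved, the sketch below accumulates delta times into absolute ticks and formats each as Bar:Bt:Tck. It is not the code of 'm2t2t_awk': the function name bbt and the sample delta values are invented for the example, and it assumes that the second field of each mid2txt event is a delta time in ticks, that there is a single time signature, and that the defaults of 384 ticks per bar and 96 per beat apply.

```awk
# Format an absolute tick count as Bar:Bt:Tck (bars and beats count from 1).
function bbt(ticks, ticksInBar, ticksInBeat,    bar, beat, rest) {
    bar  = int(ticks / ticksInBar) + 1
    rest = ticks % ticksInBar
    beat = int(rest / ticksInBeat) + 1
    return sprintf("%03d:%02d:%03d", bar, beat, rest % ticksInBeat)
}

BEGIN {
    # Hypothetical delta times (ticks elapsed since the previous event).
    n = split("0 48 0 96 0 432", delta, " ")
    now = 0
    for (i = 1; i <= n; i++) {
        now += delta[i]                  # running total = absolute position
        print bbt(now, 384, 96), now
    }
}
```

Saved to a file (any name will do) it can be run with '[g]awk -f' and no input file, since everything happens in the BEGIN section.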
Usage: [g]awk -f sibfix_awk inputfile
The RISC OS versions of Sibelius, Sibelius6 and Sibelius7, have a 'feature' that can produce two NoteOns followed by two NoteOffs where notes are exchanged between voices on the same track. Midiphile marks these up with 'N' and 'O' flags. Where these occur in Midiphile's text output, 'sibfix_awk' will correct the text file; it is just the sequence of commands which is at fault. If the fault is not corrected there are likely to be difficulties with using subsequent scripts on the file. Where 'sibfix_awk' has moved a line into the correct order, the use of single spaces between fields marks it out. The awk scripts will function on these lines just the same as on the untouched ones.
Where scripts require option(s) to be set, this can be done from the command line (in the form: -v var=value) or by changing the values at the top of the BEGIN section within the script. The defaults are the values written into the top of the BEGIN section, e.g.
# Enter pattern on the command line, for example:
# gawk -v pattern="+2+2+1+2-2-1+5-7" -f mel05_awk mel02_txt
# NOTE: no spaces around the equals sign above...
# OR
# Enter pattern below eg. pattern = "+2+2+1+2-2-1+5-7" (spaces OK).
BEGIN { if (! pattern) { pattern = "+2+2+1+2-2-1+5-7" }
#------------DON'T ALTER BELOW THIS LINE---------------#
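One point about this idiom is worth a small sketch. A test of the form 'if (! var)' cannot tell an unset variable from one deliberately set to 0 or to an empty string, so '-v var=0' would be replaced by the default. Comparing against the empty string avoids that. The helper name or_default below is invented for this illustration and does not appear in the scripts.

```awk
# Return 'value' unless it is unset or empty, in which case return 'fallback'.
# Comparing against "" (rather than testing !value) lets an explicit 0 through.
function or_default(value, fallback) {
    return (value == "") ? fallback : value
}

BEGIN {
    pattern = or_default(pattern, "+2+2+1+2-2-1+5-7")
    print "pattern = " pattern
}
```

Run without options this prints the built-in default; run with -v pattern="+5-7" it prints the value from the command line.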
Usage: [g]awk [-v var=value] -f mel01_awk inputfile
'mel01_awk' takes as input Midiphile's text analysis of a MIDI file, filters out unwanted data, and calculates the tick count at the start and end of each note event. This count is appended to the position field and the result is written to the file 'mel01_txt'. So a snatch of the Fugue now looks like:
MIDI data output by 'mel01_awk'
Text format suitable for 'mel02_awk'.
002:03:048:768 Midi NoteOn 1 G3 v:80
002:04:000:840 Midi NoteOn 1 G3 v:0
002:04:000:840 Midi NoteOn 1 A3 v:80
002:04:048:888 Midi NoteOn 1 A3 v:0
002:04:048:888 Midi NoteOn 1 B3 v:80
003:01:000:960 Midi NoteOn 1 B3 v:0
003:01:000:960 Midi NoteOn 1 C4 v:80
003:01:072:1032 Midi NoteOn 1 C4 v:0
003:01:072:1032 Midi NoteOn 1 D4 v:80
003:01:084:1044 Midi NoteOn 1 D4 v:0
003:01:084:1044 Midi NoteOn 1 C4 v:80
003:02:000:1080 Midi NoteOn 1 C4 v:0
003:02:000:1080 Midi NoteOn 1 B3 v:80
003:02:048:1128 Midi NoteOn 1 B3 v:0
003:02:048:1128 Midi NoteOn 1 E4 v:80
003:03:000:1200 Midi NoteOn 1 E4 v:0
003:03:000:1200 Midi NoteOn 1 A3 v:80
003:03:048:1248 Midi NoteOn 1 A3 v:0
003:03:048:1248 Midi NoteOn 1 D4 v:80
003:04:024:1344 Midi NoteOn 1 D4 v:0
003:04:024:1344 Midi NoteOn 1 E4 v:80
003:04:048:1368 Midi NoteOn 1 E4 v:0
003:04:048:1368 Midi NoteOn 1 D4 v:80
003:04:072:1392 Midi NoteOn 1 D4 v:0
003:04:072:1392 Midi NoteOn 1 C4 v:80
004:01:000:1440 Midi NoteOn 1 C4 v:0
Time Signature(s) found at:
001:01:000 4/4

The script does allow you to specify how many ticks there are in a bar and in a beat, either in the BEGIN section or as command line options. However, if 'mel01_awk' finds time signature and tick division data in the input file it will use this information to override command line arguments and its own internal defaults. So for the most part no variable values need be given: 'mel01_awk' finds the info for itself. At present the script will only apply a single time signature (the first, at position tick 0) but does register and print out any others found. If the variable 'readTimeSig' is set to zero then the script uses the command line or default time signature implied by the values of 'ticksInBar' and 'ticksInBeat'. This feature could be used to re-bar or change the time signature. The defaults are:
BEGIN {
if (! ticksInBar) { ticksInBar = 384 }
if (! ticksInBeat) { ticksInBeat = 96 }
if (! ppqn) { ppqn = 96 }
if (readTimeSig == "") { readTimeSig = 1 }   # == "" so that readTimeSig=0 is honoured
#Default: 96 ticks/quarternote, 4/4 time sig.
(OR on the command line)
gawk -v ticksInBar=384 -v ticksInBeat=96 -v ppqn=96 -f mel01_awk inputfile
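The relationship between a Bar:Bt:Tck position and its tick count under these defaults can be sketched as follows. This is an illustration rather than the code of 'mel01_awk'; the function name abs_ticks is made up, and a single time signature is assumed throughout. The two sample positions reproduce the tick values printed beside them in the 'har01_txt' and 'mel05' listings of this guide.

```awk
# Convert a "Bar:Bt:Tck" position to ticks from the start of the piece.
# Bars and beats are numbered from 1, ticks within the beat from 0.
function abs_ticks(pos, ticksInBar, ticksInBeat,    p) {
    split(pos, p, ":")
    return (p[1] - 1) * ticksInBar + (p[2] - 1) * ticksInBeat + p[3]
}

BEGIN {
    print abs_ticks("002:03:000", 384, 96)    # 384 + 192 = 576
    print abs_ticks("002:04:000", 384, 96)    # 384 + 288 = 672
}
```

Changing the second and third arguments is all that is needed to try a different barring, for example 288 and 96 for 3/4 time at 96 ppqn.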
Usage: [g]awk -f mel02_awk mel01_txt
This script takes 'mel01_txt' as input and calculates the intervals between each note in semitones, expressing rising intervals as '+x' and falling intervals as '-x'. It assumes a starting pitch of middle C for the first interval and expresses rests as '-0' and repeated note intervals as '+0'. Two notes separated by a rest (eg. B, rest, C) will have their interval spread across the rest (eg. B, -0, +1). The above snatch (plus a few more notes) of data is converted to:
001:01:000:0 -0 REST 768
002:03:048:768 +7 G3 72
002:04:000:840 +2 A3 48
002:04:048:888 +2 B3 72
003:01:000:960 +1 C4 72
003:01:072:1032 +2 D4 12
003:01:084:1044 -2 C4 36
003:02:000:1080 -1 B3 48
003:02:048:1128 +5 E4 72
003:03:000:1200 -7 A3 48
003:03:048:1248 +5 D4 96
003:04:024:1344 +2 E4 24
003:04:048:1368 -2 D4 24
003:04:072:1392 -2 C4 48
004:01:000:1440 -1 B3 24
004:01:024:1464 -4 G3 24
004:01:048:1488 +2 A3 24
004:01:072:1512 +2 B3 48
004:02:000:1560 +1 C4 24
004:02:024:1584 -1 B3 24
004:02:048:1608 +1 C4 24
004:02:072:1632 +2 D4 48
004:03:000:1680 +2 E4 24
004:03:024:1704 -2 D4 24
004:03:048:1728 +2 E4 24
004:03:072:1752 +2 F#4 48
004:04:000:1800 +1 G4 48
004:04:048:1848 -8 B3 72
005:01:000:1920 +1 C4 48

Displaying melodic data as a succession of plus and minus intervals as above provides a convenient format for further analysis. Identifiable themes and motives (eg. +2+2+1+2-2-1+5-7) which remain constant through transpositions and modulations can be searched for, perhaps with the addition of 'wildcards' (regular expressions) to find variations and transformations. This analysis can also take account of durations if required by using the 'tick' information in the fourth column from the left.
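The interval arithmetic itself is simple enough to sketch. The function names note_num and interval are invented here and this is not the code of 'mel02_awk'. It assumes Midiphile's naming, in which C3 is middle C (MIDI note 60, as the 'har01_txt' data in this guide shows), and it handles only naturals and sharps, which is all that appears in the listings.

```awk
# MIDI note number for a name such as "G3" or "F#4", taking C3 as middle C (60).
function note_num(name,    semis, rest) {
    semis = index("C D EF G A B", substr(name, 1, 1)) - 1   # C=0 D=2 E=4 F=5 ...
    rest  = substr(name, 2)
    if (substr(rest, 1, 1) == "#") { semis++; rest = substr(rest, 2) }
    return 60 + semis + 12 * (rest - 3)
}

# Signed interval in semitones between two note names, e.g. "+7" or "-2".
function interval(from, to,    d) {
    d = note_num(to) - note_num(from)
    return (d < 0 ? "-" : "+") (d < 0 ? -d : d)
}

BEGIN {
    n = split("G3 A3 B3 C4 D4 C4 B3 E4 A3", notes, " ")
    prev = "C3"                      # the first interval is measured from middle C
    for (i = 1; i <= n; i++) {
        line = line interval(prev, notes[i])
        prev = notes[i]
    }
    print line                       # +7+2+2+1+2-2-1+5-7
}
```

The nine notes are the opening of the fugue subject, so the printed string is the first interval (+7, from the assumed middle C) followed by the motive quoted above.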
Also the data can be thoroughly sifted to produce a profile of all melodic combinations - which is precisely what the next script does.
Usage: [g]awk [-v var=value] -f mel03_awk mel02_txt
This script takes as input 'mel02_txt' and catalogues every melodic pattern present in the file... or patterns up to a configurable pattern length.
If all combinations are being catalogued this can require substantial memory and CPU resources. Therefore the default pattern length is set at 12. For example, a melody of 250 notes can produce an output file of 5MB if all combinations are printed out. To change the maximum length of patterns searched for, set the 'patternlength' variable to the required number, or to the string "no_limit" to search every single pattern ranging from the whole melody down.
By default only patterns occurring two or more times are output (to 'mel03_txt'). Change the variable 'occurrences = 2' to '1' to see all combinations of notes in the output file.
# Change 'occurrences = x' to screen out occurrences below 'x'.
# To limit 'patternlength' give it the desired length or assign
# the string "no_limit" -i.e. all combinations are catalogued,
# which may take some time. Default is 12 notes.
BEGIN { if (! occurrences) { occurrences = 2 }
if (! patternlength) { patternlength = 12 }
}
#-----DO NOT ALTER BELOW THIS LINE-----#
(OR on the command line)
gawk -v occurrences=2 -v patternlength=12 -f mel03_awk mel02_txt
Here are the first few lines of 'mel03_txt' when applied to the first voice of the fugue with the above settings:
5 of +2+2+1+2-2-1
3 of -1+5-7+5+2-2-2-1
5 of +5+2
5 of +5+2-2
3 of +2-2-1+5-7+5+2-2-2
9 of +1+2+2
2 of -5+2+2
5 of -2-1-2
5 of -2+2+2
3 of -2-1+5-7+5+2-2-2-1
3 of +1+2-2-1+5-7+5+2-2-2-1
9 of +2+1+2
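A catalogue of this kind can be built by counting every run of consecutive intervals in an associative array. The sketch below shows the idea only and is not taken from 'mel03_awk': the function name catalogue is invented, the pattern length is counted in intervals, and runs shorter than two intervals are ignored, all of which are assumptions made for the example.

```awk
# Count every run of 2..maxlen consecutive intervals found in iv[1..n].
function catalogue(iv, n, maxlen, count,    start, len, pat) {
    for (start = 1; start <= n; start++) {
        pat = iv[start]
        for (len = 2; len <= maxlen && start + len - 1 <= n; len++) {
            pat = pat iv[start + len - 1]    # extend the run by one interval
            count[pat]++
        }
    }
}

BEGIN {
    occurrences = 2
    n = split("+2 +2 +1 +2 -2 -1 +5 -7 +2 +2 +1 +2", iv, " ")
    split("", count)                          # start with an empty array
    catalogue(iv, n, 12, count)
    for (pat in count)
        if (count[pat] >= occurrences) print count[pat] " of " pat
}
```

The number of runs grows roughly with the melody length multiplied by the maximum pattern length, which is why an unlimited search over a long melody becomes expensive. The order in which 'for (pat in count)' visits the patterns is not defined, so the lines come out unsorted.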
Usage: [g]awk [-v var=value] -f mel04_awk mel02_txt
This little script takes as input 'mel02_txt' and calculates the number and type of intervals - in semitones. Leaps greater than an octave are reduced to within an octave compass by default. Set -v compressto8ve="no" on the command line or alter the variable in the BEGIN section below to change this behaviour.
BEGIN { if (! compressto8ve) { compressto8ve = "yes" }
}
#-------------DO NOT ALTER BELOW THIS LINE------------#
The script also accumulates all plus & minus intervals into a total. Running the script on the first voice of the fugue produces the file 'mel04_txt' with the following output:
File: mel04_txt, interval data.
Number of +0 semitone intervals: 14
Number of +1 semitone intervals: 33
Number of +2 semitone intervals: 59
Number of +3 semitone intervals: 1
Number of -1 semitone intervals: 35
Number of +4 semitone intervals: 1
Number of -2 semitone intervals: 54
Number of -3 semitone intervals: 4
Number of +5 semitone intervals: 15
Number of -4 semitone intervals: 1
Number of -5 semitone intervals: 2
Number of +7 semitone intervals: 2
Number of -6 semitone intervals: 1
Number of +8 semitone intervals: 2
Number of -7 semitone intervals: 7
Number of -8 semitone intervals: 1
Accumulated plus & minus interval = 31
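The octave compression and the tally can be sketched in a few lines. This is an illustration with an invented function name (compress) and invented sample intervals, not the code of 'mel04_awk'. Whether the real script adds the folded or the unfolded values into its accumulated total is not stated above; the sketch adds the folded ones.

```awk
# Fold a leap wider than an octave back inside the octave: +14 -> +2, -19 -> -7.
function compress(iv) {
    while (iv > 12)  iv -= 12
    while (iv < -12) iv += 12
    return iv
}

BEGIN {
    n = split("+7 +2 -1 +14 -19 +0", iv, " ")    # hypothetical intervals
    for (i = 1; i <= n; i++) {
        c = compress(iv[i] + 0)                  # "+14" + 0 gives the number 14
        tally[c]++
        total += c
    }
    for (c in tally)
        printf "Number of %+d semitone intervals: %d\n", c, tally[c]
    print "Accumulated plus & minus interval = " total
}
```

A leap of exactly an octave (12) is left alone, following the wording 'greater than an octave' above.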
Usage: [g]awk [-v var=value] -f mel05_awk mel02_txt
'mel05_awk' takes 'mel02_txt' as input and searches the file for a given (literal) melodic pattern. This script does NOT support regular expressions (use 'mel06_awk', which does). The output file is 'mel05_txt' in the current working directory.
# Enter pattern on the command line, for example:
# gawk -v pattern="+2+2+1+2-2-1+5-7" -f mel05.awk mel02.txt
# NOTE: no spaces around the equals sign above...
# OR
# Enter pattern below eg. pattern = "+2+2+1+2-2-1+5-7" (spaces OK).
BEGIN { if (! pattern) { pattern = "+2+2+1+2-2-1+5-7" }
#------------DON'T ALTER BELOW THIS LINE---------------#
Running mel05_awk on the mel02_txt which contains data from the first voice of the fugue, returns:
File: mel05.txt (output by script mel05.awk)
Match found, starting at: 002:04:000 +2 A3 48 672
Pattern found: +2+2+1+2-2-1+5-7
Match found, starting at: 007:02:000 +2 D4 48 2400
Pattern found: +2+2+1+2-2-1+5-7
Match found, starting at: 016:03:000 +2 D4 48 5952
Pattern found: +2+2+1+2-2-1+5-7
Match found, starting at: 021:01:000 +2 A3 48 7680
Pattern found: +2+2+1+2-2-1+5-7
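One way to carry out a literal search of this kind is to join the interval column into a single string, look for the pattern with index(), and remember where each interval began so that a hit can be reported as a Bar:Bt:Tck position. The sketch below takes that approach. It is not the code of 'mel05_awk'; the names find_all and where, and the comma-separated sample rows, are invented for the example.

```awk
# Record in hits[] the character offsets at which 'needle' occurs in 'hay',
# skipping a hit that stops halfway through a number (e.g. "+1" inside "+12").
function find_all(hay, needle, hits,    n, off, p, after) {
    n = 0; off = 0
    while ((p = index(substr(hay, off + 1), needle)) > 0) {
        off += p                                   # offset of this hit in 'hay'
        after = substr(hay, off + length(needle), 1)
        if (after !~ /[0-9]/) hits[++n] = off
    }
    return n
}

BEGIN {
    n = split("002:03:048,+7 002:04:000,+2 002:04:048,+2 003:01:000,+1 " \
              "003:01:072,+2 003:01:084,-2 003:02:000,-1 003:02:048,+5 " \
              "003:03:000,-7", row, " ")
    for (i = 1; i <= n; i++) {
        split(row[i], f, ",")
        where[length(joined) + 1] = f[1]   # offset of this interval -> position
        joined = joined f[2]
    }
    m = find_all(joined, "+2+2+1+2-2-1+5-7", hits)
    for (i = 1; i <= m; i++)
        print "Match found, starting at: " where[hits[i]]
}
```

With the opening of the fugue subject as sample data the single match is reported at 002:04:000, the position of the first '+2'.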
Usage: [g]awk [--re-interval] [-v var=value] -f mel06_awk mel02_txt
'mel06_awk' takes 'mel02_txt' as input and searches the file for a range of melodic patterns described by a regular expression, written as a string held in the variable 'pattern'. The output file is 'mel06_txt' in the current working directory. Using this script requires some knowledge of regexs; the gawk manual is the place to look for guidance.
# Takes mel02.txt as input.
# Outputs results to file 'mel06.txt' in current working directory.
# The input file is searched for the given melodic pattern like
# 'mel05.awk', however 'pattern' uses regular expressions rather than
# a literal string. This provides a more powerful and refined search
# facility but requires some knowledge of regexs. (Don't forget '\\+'
# and '\\-' for plus & minus in a string, though not inside a character
# class, or --re-interval for interval expressions.)
# Enter pattern on the command line, for example:
# gawk -v pattern="[-+12]{0,12}\\+5\\-7" -f mel06.awk mel02.txt
# NOTE: no spaces around the equals sign above...
# OR... Enter pattern below:
# Eg. pattern = "[-+12]{0,12}\\+5\\-7" (spaces OK).
BEGIN { if (! pattern) { pattern = "[-+12]{0,12}\\+5\\-7" }
#------------------DON'T ALTER BELOW THIS LINE---------------------#
Running mel06_awk on the mel02_txt which contains data from the first voice of the fugue, returns one extra match:
File: mel06.txt (output by script mel06.awk)
Match found, starting at: 002:04:000 +2 A3 48 672
Pattern found: +2+2+1+2-2-1+5-7
Match found, starting at: 007:02:000 +2 D4 48 2400
Pattern found: +2+2+1+2-2-1+5-7
Match found, starting at: 016:03:000 +2 D4 48 5952
Pattern found: +2+2+1+2-2-1+5-7
Match found, starting at: 017:03:024 +2 A4 24 6360
Pattern found: +2-2-2-1+5-7
Match found, starting at: 021:01:000 +2 A3 48 7680
Pattern found: +2+2+1+2-2-1+5-7
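The regular expression variant can be sketched with awk's match() function, which sets RSTART and RLENGTH for each hit. The code below is an illustration, not the code of 'mel06_awk': the function name grep_all and the sample interval string are invented, and the pattern uses '*' where the script's default uses the interval expression {0,12}, so that the sketch does not depend on interval support in the awk being used.

```awk
# Print every non-overlapping match of the regular expression 're' in 'hay'
# and return the number of matches found.
function grep_all(hay, re,    off, n) {
    off = 0; n = 0
    while (match(substr(hay, off + 1), re) && RLENGTH > 0) {
        print "Pattern found: " substr(hay, off + RSTART, RLENGTH)
        off += RSTART + RLENGTH - 1        # carry on after the end of this match
        n++
    }
    return n
}

BEGIN {
    # A run of steps drawn from + - 1 2, ending with the leaps +5 -7.
    joined = "+7+2+2+1+2-2-1+5-7+5+2-2-2-1+5-7"
    found = grep_all(joined, "[-+12]*\\+5-7")
}
```

On this sample string the loop reports the full motive first and then the shorter variant '+2-2-2-1+5-7', the same two shapes that appear in the output above.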
Scripts dealing with harmonic data are named 'harxx_awk' (where xx is a number) and their output appears in the current working directory under the name 'harxx_txt'. A similar scheme is used for melodic data, i.e. 'mel01_awk'.
Usage: [g]awk [-v var=value] -f har01[p]_awk inputfile
This script processes MIDI data taken from RISC OS Midiphile's text output - to prepare it for harmonic analysis by the script 'har02_awk'. Input takes the form:
002:01:000 Midi NoteOn 2 C3 v:80
It may help to look at the way this script sifts its input data - which is a (text) list of note-on and note-off events, in time order, presented by Midiphile's output. 'har01_awk' reads these events and constructs a running list of 'notes currently sounding'. At a regular interval set by the 'pollincrement' variable this running list (of MIDI note numbers) is written out to file. The pollincrement is set in MIDI ticks - eg. at 96 pulses per quarter note, a pollincrement of 48 would mean polling every quaver and 192 at minim intervals.
By using the 'har01p_awk' variant of the script, the running list of 'notes currently sounding' gains the facility of a piano's sustaining pedal for the length of the pollincrement. This is used for 'scooping up' arpeggi, broken or spread chords. For example, Bach's Prelude No.1 requires a pollincrement of 192 used with the 'har01p_awk' script to yield useful information. To some extent the choice of script variant and the value of pollincrement are interdependent: each should be set with an eye to the other and, most importantly, with an eye to the texture of the score.
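The polling idea, with and without the 'pedal', can be sketched as below. This is not the code of 'har01_awk' or 'har01p_awk'. The function name poll, the 'tick:note:velocity' event strings and the choice to stamp each line with the tick at which the poll falls due are all assumptions made for the example. The events are the first half of bar 1 of the Prelude from the Midiphile listing earlier in this guide, converted to ticks and MIDI note numbers.

```awk
# Poll a time-ordered list of "tick:note:velocity" events every 'inc' ticks.
# With pedal = 1 each line holds every note heard since the previous poll;
# with pedal = 0 it holds only the notes still sounding just before the poll.
function poll(ev, n, inc, pedal, out,    i, f, t, m, line, k, next_poll, sounding, held) {
    split("", sounding); split("", held)
    next_poll = inc; k = 0
    for (i = 1; i <= n; i++) {
        split(ev[i], f, ":")
        t = f[1] + 0
        while (t >= next_poll) {                # a poll falls due before this event
            line = next_poll
            for (m = 0; m <= 127; m++)
                if (pedal ? (m in held) : (m in sounding)) line = line " " m
            out[++k] = line
            split("", held)                     # release the pedal...
            for (m in sounding) held[m] = 1     # ...keeping notes still held down
            next_poll += inc
        }
        if (f[3] + 0 > 0) { sounding[f[2] + 0] = 1; held[f[2] + 0] = 1 }
        else delete sounding[f[2] + 0]          # NoteOn with v:0 ends the note
    }
    return k
}

BEGIN {
    n = split("0:60:78 24:64:78 48:67:76 72:67:0 72:72:90 96:72:0 96:76:95 " \
              "120:76:0 120:67:73 144:67:0 144:72:80 168:72:0 168:76:97 " \
              "192:76:0 192:64:0 192:60:0", ev, " ")
    poll(ev, n, 192, 1, chord); print chord[1]    # 192 60 64 67 72 76
    poll(ev, n, 192, 0, plain); print plain[1]    # 192 60 64 76
}
```

With the pedal the whole broken chord (60 64 67 72 76) is scooped up in one line; without it only the two held notes and the last semiquaver are caught, which is why the Prelude needs the pedal variant.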
Variables which can be set at the top of the BEGIN section or from the command line are:
002:03:000:576 60 64 67 72 76
003:01:000:768 60 64 67 72 76
003:03:000:960 60 62 69 74 77
004:01:000:1152 60 62 69 74 77
004:03:000:1344 59 62 67 74 77
005:01:000:1536 59 62 67 74 77
005:03:000:1728 60 64 67 72 76
006:01:000:1920 60 64 67 72 76

The above data represents an intermediate stage requiring further processing by 'har02_awk'. A batch file could be used to automate the working of 'har01_awk' and 'har02_awk'.
The data in 'har01_txt' above, reading the MIDI note numbers from left to right is just another way of writing the chords - Cmaj, Dmin (or Fmaj), Gmaj, Cmaj - or - I, II(iii)/IV(ii), V(i), I. This is precisely what the script 'har02_awk' attempts to do.
However, the ambiguity in the second bar is only the tip of the iceberg of uncertainty in regard to harmonic analysis. (Is it D minor over a pedal C or F major second inversion with an added sixth?) The script therefore works by seeking out every possible harmonic interpretation of each line of note numbers. Some lines, like the first two, are clear cut (C major) and some, like the second two, have more than one interpretation... Which is the right interpretation I leave up to the user. :-) Or even, is there a right interpretation?
Running 'har02_awk' over the output file 'har01_txt', produces the following:
002:03:000:576 60 64 67 72 76 :C Major
003:01:000:768 60 64 67 72 76 :C Major
003:03:000:960 60 62 69 74 77 :F Major(ii) +maj6 :D minor(iii) +min7
004:01:000:1152 60 62 69 74 77 :F Major(ii) +maj6 :D minor(iii) +min7
004:03:000:1344 59 62 67 74 77 :G Major(i) +min7
005:01:000:1536 59 62 67 74 77 :G Major(i) +min7
005:03:000:1728 60 64 67 72 76 :C Major
006:01:000:1920 60 64 67 72 76 :C Major
006:03:000:2112 60 64 69 76 81 :A minor(i)
007:01:000:2304 60 64 69 76 81 :A minor(i)
007:03:000:2496 60 62 66 69 74 :D Major(iii) +min7
008:01:000:2688 60 62 66 69 74 :D Major(iii) +min7
008:03:000:2880 59 62 67 74 79 :G Major(i)
009:01:000:3072 59 62 67 74 79 :G Major(i)
009:03:000:3264 59 60 64 67 72 :E minor(ii) +min6 :C Major(iii) +maj7
010:01:000:3456 59 60 64 67 72 :E minor(ii) +min6 :C Major(iii) +maj7
010:03:000:3648 57 60 64 67 72 :A minor +min7 :C Major +maj6
011:01:000:3840 57 60 64 67 72 :A minor +min7 :C Major +maj6
011:03:000:4032 50 57 62 66 72 :D Major +min7
012:01:000:4224 50 57 62 66 72 :D Major +min7
012:03:000:4416 55 59 62 67 71 :G Major
013:01:000:4608 55 59 62 67 71 :G Major
013:03:000:4800 55 58 64 67 73 :G dim
014:01:000:4992 55 58 64 67 73 :G dim
014:03:000:5184 53 57 62 69 74 :D minor(i)
015:01:000:5376 53 57 62 69 74 :D minor(i)
015:03:000:5568 53 56 62 65 71 :F dim
016:01:000:5760 53 56 62 65 71 :F dim
016:03:000:5952 52 55 60 67 72 :C Major(i)
017:01:000:6144 52 55 60 67 72 :C Major(i)
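A much reduced version of the 'every possible interpretation' idea is sketched below: the note numbers are reduced to pitch classes and each one is tried as the root of a major, minor or diminished triad. It is a sketch under those assumptions, not the method of 'har02_awk'. It does not report inversions or added notes, it names sharps only, and the function name chords is invented.

```awk
# List in out[] every major, minor or diminished triad contained in a line of
# MIDI note numbers, trying each pitch class present as a possible root.
function chords(notes, out,    n, f, i, pc, root, k, names, min3, maj3, dim5, per5) {
    split("C C# D D# E F F# G G# A A# B", names, " ")
    split("", out); split("", pc)
    n = split(notes, f, " ")
    for (i = 1; i <= n; i++) pc[f[i] % 12] = 1     # reduce to pitch classes
    k = 0
    for (root = 0; root < 12; root++) {
        if (!(root in pc)) continue
        min3 = (root + 3) % 12; maj3 = (root + 4) % 12
        dim5 = (root + 6) % 12; per5 = (root + 7) % 12
        if ((maj3 in pc) && (per5 in pc)) out[++k] = names[root + 1] " Major"
        if ((min3 in pc) && (per5 in pc)) out[++k] = names[root + 1] " minor"
        if ((min3 in pc) && (dim5 in pc)) out[++k] = names[root + 1] " dim"
    }
    return k
}

BEGIN {
    n = split("60 64 67 72 76;60 62 69 74 77", row, ";")
    for (r = 1; r <= n; r++) {
        line = row[r]
        k = chords(row[r], found)
        for (j = 1; j <= k; j++) line = line " :" found[j]
        print line
    }
}
```

The first row yields only C Major, while the second yields both D minor and F Major, the same ambiguity discussed above. A sketch this simple also finds triads the listing does not show (a B diminished triad inside the G chord with its minor seventh, for instance), so some ranking of candidates would still be needed.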
Having arrived at this stage with all (or most) of the harmonic progressions of the piece visible, it is now possible to search and sift the data for chord patterns or sequences.
A script to search for chord patterns or sequences and modulations of such patterns. Which I have yet to write :-(
A script which can determine and trace the tonal centre (keys/modulations) of a piece - that is, rather than list every possible chord as in 'har02_awk', a script that will express an opinion as to the sequence of harmonies that forms the underlying structure. Which I have yet to write :-(
This script takes input from 'har01_txt' files which it searches for patterns of notes which could be used to write a canon to fit the harmonic sequence represented by the input file.
Usage: [g]awk [-v pitch=n -v time=u] -f har05_awk har01_txt
Output from har05_awk
Pitch interval between voices is 7 semitones
Time lapse between voices is 2 units
001:01:000:0 CEG
001:03:000:192 CEG
002:01:000:384 DCA
002:03:000:576 DCA
003:01:000:768 DBG
003:03:000:960 DBG
004:01:000:1152 EG
004:03:000:1344 EG
005:01:000:1536 EA
005:03:000:1728 EA
006:01:000:1920 ADF#
006:03:000:2112 ADF#
007:01:000:2304 DGB
007:03:000:2496 DGB
008:01:000:2688 GBE
008:03:000:2880 GBE

The notes listed in 'har05_txt' above are the ones which can be used in the first voice at the Bar:Beat:Tick position listed; when transposed by 'pitch' and delayed by 'time' they form the second voice and fit the harmonies at their new position. Output is appended to 'har05_txt' so that a number of different pitches and times can be tried out to find the best one, and for comparison.
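The test behind such a listing can be sketched as follows. The rule used here is inferred from the sample output rather than taken from 'har05_awk': a note of the current chord is kept if, moved down by 'pitch' semitones, its pitch class belongs to the chord found 'time' units later. The function names, the variable lag (standing in for 'time') and the ascending order of the printed note names are choices made for the example.

```awk
# Collect the pitch classes (0-11) of a line of MIDI note numbers into set[].
function pitch_classes(chord, set,    n, f, i) {
    split("", set)
    n = split(chord, f, " ")
    for (i = 1; i <= n; i++) set[f[i] % 12] = 1
}

# Notes of chord 'now' which, moved down by 'pitch' semitones, fit chord 'later'.
function usable(now, later, pitch,    a, b, x, y, names, s) {
    split("C C# D D# E F F# G G# A A# B", names, " ")
    split("", a); split("", b)
    pitch_classes(now, a); pitch_classes(later, b)
    s = ""
    for (x = 0; x < 12; x++) {
        if (!(x in a)) continue
        y = ((x - pitch) % 12 + 12) % 12           # keep the result in 0..11
        if (y in b) s = s (s == "" ? "" : " ") names[x + 1]
    }
    return s
}

BEGIN {
    pitch = 7; lag = 2
    n = split("60 64 67 72 76;60 64 67 72 76;60 62 69 74 77;" \
              "60 62 69 74 77;59 62 67 74 77;59 62 67 74 77", row, ";")
    for (i = 1; i + lag <= n; i++)
        print i ": " usable(row[i], row[i + lag], pitch)
}
```

On the first six polled chords of the Prelude this gives C E G for the first two rows and C D A for the next two, the same sets of notes as the first four lines of the output above.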
And here is a canon written with the help of this script on the Prelude No.1 harmonies. Canon in MIDI file format. This (feeble) attempt at a canon, is included for illustrative purposes only!
Click here to download the above music analysis scripts and a copy of this page: awk scripts in a ZIP archive (20Kb).
Click here to download the awk HTML sorting script, entity mapping file and brief instructions.