Awk – Print Problem Solutions


I have a script that takes values from a log file and populates them into an Excel spreadsheet:

function ltrim(s) { sub(/^[ \t]+/, "", s); return s }
function rtrim(s) { sub(/[ \t]+$/, "", s); return s }
function trim(s)  { return rtrim(ltrim(s)); }

function getfield(xml, fieldname) {
    i = index(xml, "fieldName=\"" fieldname "\"");
    if (i < 1) {
        ts = "";
    }
    else {
        ts = substr(xml, i + length("fieldName=\"" fieldname "\""));
        ts = substr(ts, index(ts, "value=\"") + length("value=\""));
        ts = substr(ts, 1, index(ts, "\"") - 1);
        gsub(/&amp;/, "&", ts);
        if (index(ts, ",") > 0) ts = "\"" ts "\"";
    }
    return ts;
}

BEGIN { FS = "--------"; OFS = ","; }

{ sub(/..,[A-Z][A-Z]/, "CT-PT") }

{
    orig = $0;
    # $7 = getfield(orig, "Modalities In Study");
    $9 = getfield(orig, "Subject Patients Sex");
    $10 = getfield(orig, "Modality Body Region");
    $11 = getfield(orig, "Patients Birth Date");
    $12 = getfield(orig, "Protocol Name");
    $13 = getfield(orig, "Timepoint ID");
    print;
}

Everything works except for one thing. Wherever I place this print statement in the code, the header is repeated throughout the spreadsheet: it shows up in every other row. I only want it to appear in the first row:

{print "Study Instance UID,Number of Series,Number of Instances, Exam Transfer Date,Exam Date,Subject Number,Modalities In Study,Upload Status,Subject Patients Sex,Modality Body Region,Patients Birth Date,Protocol Name,Timepoint"}

I am not sure why this occurs. I even tried placing this line right after function trim(s) { return rtrim(ltrim(s)); }, but the header still repeats throughout the spreadsheet. Does anyone have suggestions or know why this happens?

Best Answer

Aside from functions, awk code looks like

condition {actions}

If the {actions} block is missing, the implicit action when the condition is true is {print}.
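
As a small illustration of the implicit action, here is a sketch run from the shell. The three input lines are invented sample data, not taken from the log file in the question. The rule has a condition and no action block, so awk prints every record for which the condition is true:

```shell
# A rule with a condition but no {actions} block prints each matching record.
# The input lines a, b, c are made-up sample data.
out=$(printf 'a\nb\nc\n' | awk 'NR >= 2')
printf '%s\n' "$out"
```

Only the second and third lines are printed, because NR >= 2 is false for the first record.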

If the condition is missing, and this directly relates to your question, it is implicitly true for every line of the input. That's why you get that line printed so many times.
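
The effect can be reproduced with a minimal sketch. The two input rows and the word "header" are placeholders standing in for the log records and the real header line:

```shell
# The first block has no condition, so it runs once for every input record,
# just before the second rule prints the record itself.
# row1, row2 and "header" are placeholder values for illustration.
out=$(printf 'row1\nrow2\n' | awk '{ print "header" } { print }')
printf '%s\n' "$out"
```

The output alternates header, row1, header, row2, which matches the "every other row" symptom described in the question.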

You need to specify a condition. Either put the print statement inside the BEGIN block as Stéphane suggests, or you can do this:

NR == 1 {print "this is the header"}

That action only occurs when the condition is true, which is for the first line of input.
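
Both options can be compared side by side in a short sketch. The input rows and the word "header" are again placeholders, and the variable names are chosen only for this example:

```shell
# Option 1: print the header from BEGIN, which runs once before any input is read.
begin_out=$(printf 'row1\nrow2\n' | awk 'BEGIN { print "header" } { print }')

# Option 2: print the header on the first record. The NR == 1 rule must come
# before the rule that prints the record, or the header would land on line 2.
nr_out=$(printf 'row1\nrow2\n' | awk 'NR == 1 { print "header" } { print }')

# With empty input the two differ: BEGIN still prints the header,
# while NR == 1 never becomes true because there is no first record.
empty_begin=$(awk 'BEGIN { print "header" } { print }' </dev/null)
empty_nr=$(awk 'NR == 1 { print "header" } { print }' </dev/null)

printf '%s\n' "$begin_out"
```

For non-empty input both variants produce the header once, followed by the rows. Which one fits better probably depends on whether a header-only file is wanted when the log is empty.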
