Ubuntu – How to delete everything after second occurrence of quotes using the command line

bash, command line, scripts, text processing

I have this stored in a variable:

   sCellEventTrap-03-28 TRAP-TYPE  -- CAC Code: 00
        ENTERPRISE compaq
        VARIABLES  { scellNameDateTime,
                     scellSWComponent,
                     scellECode,
                     scellCAC,
                     scellEIP}
        DESCRIPTION
             "Severity: Normal -- informational in nature. A physical disk drive has experienced an ID block inconsistency during a periodic drive check."
           --#TYPE      "StorageCell Event"
           --#SUMMARY   "SCellName-TimeDate %s : SWCID %d : ECode: %d : CAC %d : EIP %d."
           --#ARGUMENTS {0,1,2,3,4,}
           --#SEVERITY  INFORMATIONAL
           --#TIMEINDEX 136
           --#STATE     WARNING
        ::= 13600808

I want to delete everything after the second occurrence of ". That would give me:

 sCellEventTrap-03-28 TRAP-TYPE  -- CAC Code: 00
        ENTERPRISE compaq
        VARIABLES  { scellNameDateTime,
                     scellSWComponent,
                     scellECode,
                     scellCAC,
                     scellEIP}
        DESCRIPTION
             "Severity: Normal -- informational in nature. A physical disk drive has experienced an ID block inconsistency during a periodic drive check."

Another example:

    genericSanEvent TRAP-TYPE
        ENTERPRISE hpSanManager
        VARIABLES  { severityLevel, category, id,
                     msgString, contactName, contactEmail,
                     contactWorkPhone, contactHomePhone, 
                     contactPager, contactFax }
        DESCRIPTION
                        "A generic SAN event has occurred.  The variables are:
                            severityLevel - the event severity level;
                            category - Category of the event being reported;
                            code - ID of the event in the given category;
                            msgString - the message string describing
                                the event;
                            contactName - the name of the individual
                                to be notified of the event;
                            contactEmail - the e-mail address of the
                                individual referred to in contactName;
                            contactWorkPhone - the work phone number
                                of the individual referred to in 
                                contactName;
                            contactHomePhone - the home phone number
                                of the individual referred to in 
                                contactName;
                            contactPager - the pager number of the 
                                individual referred to in contactName;
                            contactFax - the FAX number of the individual
                                 referred to in contactName"
     -- The following are attributes used by xnmloadmib for improved formatting
     --#TYPE "OV SAM SAN Event"
     --#SUMMARY "OV SAM SAN Event, Category/Id: %d/%d, Msg: %d  Severity: %d  Contact: %d"
     --#ARGUMENTS {1,2,3,0,4}
     --#SEVERITY CRITICAL
     --#GENERIC 6
     --#CATEGORY "Application Alert Events"
     --#SOURCE_ID "T"
        ::= 1

The output for this example should be:

    genericSanEvent TRAP-TYPE
        ENTERPRISE hpSanManager
        VARIABLES  { severityLevel, category, id,
                     msgString, contactName, contactEmail,
                     contactWorkPhone, contactHomePhone, 
                     contactPager, contactFax }
        DESCRIPTION
                        "A generic SAN event has occurred.  The variables are:
                            severityLevel - the event severity level;
                            category - Category of the event being reported;
                            code - ID of the event in the given category;
                            msgString - the message string describing
                                the event;
                            contactName - the name of the individual
                                to be notified of the event;
                            contactEmail - the e-mail address of the
                                individual referred to in contactName;
                            contactWorkPhone - the work phone number
                                of the individual referred to in 
                                contactName;
                            contactHomePhone - the home phone number
                                of the individual referred to in 
                                contactName;
                            contactPager - the pager number of the 
                                individual referred to in contactName;
                            contactFax - the FAX number of the individual
                                 referred to in contactName"

Best Answer

Using awk:

awk -v RS='"' -v ORS='"' 'NR==1{print} NR==2{print; printf"\n";exit}' file

This sets the record separator to ", so the first two records together contain everything up to the second quote. We print those two records and then we are done. In more detail:

  • -v RS='"'

    This sets the input record separator to a double quote.

  • -v ORS='"'

    This sets the output record separator to a double quote, so each printed record is followed by the quote that was consumed as its separator.

  • NR==1{print}

    This tells awk to print the first record, that is, all text before the first double quote.

  • NR==2{print; printf"\n";exit}

    This tells awk to print the second record (the text between the first and second quotes), then print a newline character, and then exit.
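
Since the question says the text is stored in a variable rather than a file, here is a minimal sketch of feeding a variable to the same awk program on standard input. The variable names text and result and the shortened sample text are assumptions made up for illustration:

```shell
#!/bin/sh
# Hypothetical sample standing in for the variable from the question.
text='sampleTrap TRAP-TYPE
    DESCRIPTION
        "first line of the description,
         second line of the description"
    --#TYPE "Some Event"
    ::= 1'

# Pipe the variable into awk instead of naming a file.
# RS and ORS are both a double quote, so records 1 and 2 are printed
# with their quotes restored, and awk exits before reading any further.
result=$(printf '%s\n' "$text" |
    awk -v RS='"' -v ORS='"' 'NR==1{print} NR==2{print; printf "\n"; exit}')

printf '%s\n' "$result"
```

Note that the command substitution strips the trailing newline which the awk program adds, so result ends at the second double quote.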

Using sed:

sed -r 'H;1h;$!d;x; s/(([^"]*"){2}).*/\1/' file

This reads the whole file into memory at once, so avoid this approach if the file is huge. It works as follows:

  • H;1h;$!d;x

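    This is a useful sed idiom that reads the whole file into the pattern space: H appends each line to the hold space, 1h replaces the hold space with the first line (which avoids a leading empty line), $!d discards every line except the last, and x swaps the accumulated text into the pattern space.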

  • s/(([^"]*"){2}).*/\1/

    This looks for the second " and then deletes all text which follows the second quote.

    The regex (([^"]*"){2}) matches all text up to and including the second double quote and saves it in group 1. The regex .* matches everything that follows, to the end of the file. The replacement text is group 1, \1, so only the part up to the second quote is kept.
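
As a rough sketch, the same sed command can also be applied to text held in a variable. The names text and result and the sample text are hypothetical, and -E is used here as the more widely supported spelling of GNU sed's -r:

```shell
#!/bin/sh
# Hypothetical sample standing in for the variable from the question.
text='sampleTrap TRAP-TYPE
    DESCRIPTION
        "first line of the description,
         second line of the description"
    --#TYPE "Some Event"
    ::= 1'

# H;1h;$!d;x collects every input line into the pattern space;
# the substitution then keeps only the text through the second quote.
result=$(printf '%s\n' "$text" |
    sed -E 'H;1h;$!d;x; s/(([^"]*"){2}).*/\1/')

printf '%s\n' "$result"
```

If the input contains fewer than two double quotes, the substitution does not match and sed prints the text unchanged.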
