Print a line only if the previous line includes a specific word

awk, perl, sed, text-processing

We have the following file with hostnames and host IPs (a long file, with 90-100 machines per Linux machine):

hosts.cluster.conf

  "href" : "http://localhost:8080/api/v1/hosts/worker02.sys87.com",
  "Hosts" : 
    "cluster_name" : "hdp",
    "host_name" : "worker02.sys87.com",
    "ip" : "23.67.32.65"


  "href" : "http://localhost:8080/api/v1/hosts/worker03.sys87.com",
  "Hosts" : 
    "cluster_name" : "hdp",
    "host_name" : "worker03.sys87.com",
    "ip" : "23.67.32.66"


  "href" : "http://localhost:8080/api/v1/hosts/worker04.sys87.com",
  "Hosts" : 
    "host_name" : "worker04.sys87.com",
    "ip" : "23.67.32.67"


  "href" : "http://localhost:8080/api/v1/hosts/worker05.sys87.com",
  "Hosts" : 
    "cluster_name" : "hdp",
    "host_name" : "worker05.sys87.com",
    "ip" : "23.67.32.68"

We want to print every host_name line, but only if the line immediately before it contains the word "cluster_name".

Expected results:

"host_name" : "worker02.sys87.com",
"host_name" : "worker03.sys87.com",
"host_name" : "worker05.sys87.com",

Best Answer

Short awk solution:

awk '/cluster_name/{ cl=NR }/host_name/ && NR-1==cl' hosts.cluster.conf
  • /cluster_name/{ cl=NR } - stores the record (line) number of each "cluster_name" line in cl
  • /host_name/ - matches each "host_name" line
  • NR-1==cl - true only when the current record number NR is exactly one more than cl, i.e. the "host_name" line immediately follows a "cluster_name" line; since this pattern has no action, awk prints the matching line
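
To see the three parts working together, here is a minimal sketch that runs the same program against a small sample. The temporary file and the variable names sample and result are assumptions made for illustration; only the awk program itself comes from the answer above.

```shell
# Build a small sample file in a temporary location (hypothetical, for illustration).
sample=$(mktemp)
cat > "$sample" <<'EOF'
  "Hosts" :
    "cluster_name" : "hdp",
    "host_name" : "worker02.sys87.com",
    "ip" : "23.67.32.65"
  "Hosts" :
    "host_name" : "worker04.sys87.com",
    "ip" : "23.67.32.67"
EOF

# Same logic as the one-liner, spread over several lines with comments.
result=$(awk '
  /cluster_name/ { cl = NR }     # remember the line number of "cluster_name"
  /host_name/ && NR - 1 == cl    # print only if "cluster_name" was the previous line
' "$sample")

printf '%s\n' "$result"
rm -f "$sample"
```

With this sample, only the worker02 line should be printed, because the worker04 line follows "Hosts" rather than "cluster_name".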

The output:

"host_name" : "worker02.sys87.com",
"host_name" : "worker03.sys87.com",
"host_name" : "worker05.sys87.com",

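If host_name can appear on the very first line (which I doubt happens in practice), the version above would print it: on line 1, NR-1 is 0 and the unset cl also compares as 0. To avoid that, use the following version, which additionally requires cl to be set: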

awk '/cluster_name/{ cl=NR }/host_name/ && cl && NR-1==cl' hosts.cluster.conf
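
The difference between the two versions can be checked with a small sketch. The sample below, including the worker01 entry and the variable names edge, unguarded and guarded, is made up for illustration and is not part of the original file.

```shell
# Hypothetical sample in which "host_name" is the very first line.
edge=$(mktemp)
cat > "$edge" <<'EOF'
"host_name" : "worker01.sys87.com",
"cluster_name" : "hdp",
"host_name" : "worker02.sys87.com",
EOF

# First version: on line 1, NR-1 is 0 and the unset cl also compares as 0.
unguarded=$(awk '/cluster_name/{ cl=NR }/host_name/ && NR-1==cl' "$edge")

# Guarded version: "cl &&" stays false until a "cluster_name" line has been seen.
guarded=$(awk '/cluster_name/{ cl=NR }/host_name/ && cl && NR-1==cl' "$edge")

printf 'unguarded:\n%s\n' "$unguarded"
printf 'guarded:\n%s\n' "$guarded"
rm -f "$edge"
```

Here the first version should report both host_name lines, while the guarded version should report only the worker02 line.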