How to Use Awk to Indent a Source File Based on Simple Rules

awk, indentation, text processing

How can I indent source code based on a couple of simple rules?

As an example, I've used sed and awk to transform a Selenium HTML source table into the following RSpec-like code. How could I consistently indent the lines between describe and end? Ideally I would like to be able to add indenting to

describe "Landing" do
visit("http://some_url/url_reset")
visit("http://some_url/url_3_step_minimal_foundation")
# comments
expect(css_vehicle1_auto_year) to be_visible
end
describe "Stage1" do
wait_for_element_present(css_vehicle1_auto_year option_auto_year)
select(auto_year, from: css_vehicle1_auto_year)
...
end
describe "Stage2" do
fill_in(css_driver1_first_name, with: driver1_first_name)
fill_in(css_driver1_last_name, with: driver1_last_name)
...
submit(css_policy_form)
expect(css_vehicle1_coverage_type) to be_visible
end
describe "Stage3" do
wait_for_element_present(css_vehicle1_coverage_type)
select(coverage_type, from: css_vehicle1_coverage_type)
find(css_has_auto_insurance).click
...
submit(css_policy_form)
expect(css_quotes) to be_visible
end

so that I end up with

describe "Landing" do
  visit("http://some_url/url_reset")
  visit("http://some_url/url_3_step_minimal_foundation")
  # comments
  expect(css_vehicle1_auto_year) to be_visible
end
describe "Stage1" do
  wait_for_element_present(css_vehicle1_auto_year option_auto_year)
  select(auto_year, from: css_vehicle1_auto_year)
  ...
end
describe "Stage2" do
  fill_in(css_driver1_first_name, with: driver1_first_name)
  fill_in(css_driver1_last_name, with: driver1_last_name)
  ...
  submit(css_policy_form)
  expect(css_vehicle1_coverage_type) to be_visible
end
describe "Stage3" do
  wait_for_element_present(css_vehicle1_coverage_type)
  select(coverage_type, from: css_vehicle1_coverage_type)
  find(css_has_auto_insurance).click
  ...
  submit(css_policy_form)
  expect(css_quotes) to be_visible
end

The source code for the existing sed and awk scripts is at https://jsfiddle.net/4gbj5mh4/ but it's really messy and not what I am asking about. I've got the hang of simple sed and awk scripts, but I'm not sure where to start with this one.

It would be great if it could also handle nested blocks (recursion). That's not essential for me, but the generalization is probably useful to others who find this question, i.e.

describe "a" do
describe "b" do
stuff
more stuff
end
end

to

describe "a" do
  describe "b" do
    stuff
    more stuff
  end
end

BTW, I am also doing this custom conversion partly because I've used variables as page objects in Selenium and they bork the built-in export to RSpec.

Best Answer

With awk:

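awk '
  # A line starting with "end" closes a block: drop one level (two spaces).
  /^end/ { sub("  ", "", indent) } # Or { indent = substr(indent, 3) }
  # Concatenate here: "print indent, $0" would insert an extra space (OFS).
  { print indent $0 }
  # A line starting with "describe" opens a block: indent the lines that follow.
  /^describe/ { indent = indent "  " }
' <file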
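
If the input might already be partly indented, or might contain lines that merely begin with the letters "end" (endpoint(...), say), a slightly stricter variant of the same idea may be worth trying. This is only a sketch under those assumptions, not something from the question: the function name indent_blocks is made up for illustration, and it assumes two-space indents and that every block opener starts with describe followed by whitespace.

```shell
# Hypothetical helper: re-indent describe/end blocks read from stdin or files.
indent_blocks() {
  awk '
    # Strip any existing leading whitespace so re-running gives the same result.
    { sub(/^[ \t]+/, "") }
    # "end" as a whole word closes a block: drop one two-space level.
    /^end([ \t]|$)/ { indent = substr(indent, 3) }
    # Leave blank lines empty instead of printing trailing spaces.
    { print ($0 == "" ? "" : indent $0) }
    # "describe ..." opens a block: lines after it get one more level.
    /^describe[ \t]/ { indent = indent "  " }
  ' "$@"
}
```

Called as indent_blocks <file, it should turn the nested describe "a" / describe "b" example from the question into the indented form shown there. Because existing indentation is stripped first, running it a second time on its own output should leave the text unchanged.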