C Shell – Condition for If String Contains a Newline Character

I'm not sure how to express this using the csh string matching syntax. I want to test whether a csh variable contains a newline. I'm basically looking for:

if ($mystr !~ <pattern for strings which contain a newline character>)

Edit: in my particular case, I am trying to make a string like this pass:

1234ABC

And a string like this fail:

1234ABC
 -------
FOOBAR

These are the output of a sed command, namely sed '1d;$d'. Not sure if that matters.

The reason why I am trying to detect newlines rather than " -------" is for defense against changes in the formatting of the file I'm parsing. (Anyway, I don't think it matters what I'm doing with the file exactly, since I'm just looking for a general solution for detecting a newline character.)

Best Answer

if ($mystr:q =~ *'\
'*) echo yes

should work in some implementations and versions of csh (like the csh and tcsh ones found on Debian). In some others (like the one found on Solaris 10), you may have better luck with

set nl = '\
'
if ($mystr:q =~ *$nl:q*) echo yes

Most people have given up trying to write reliable scripts with csh by now. Why would you use csh in this century?

This code works for me (outputs no) in tcsh 6.17.00 (Astron) 2009-07-10 (x86_64-unknown-linux) options wide,nls,dl,al,kan,rh,color,filec

set mystr = '1234ABC\
 -------\
FOOBAR'
if ($mystr:q !~ *'\
'*) then
  echo yes
else
  echo no
endif

Note that if you do:

set var = `some command`

csh splits the output of some command on blanks and stores each word in a separate element of the var array.

With:

set var = "`some command`"

it stores each non-empty line in its own element of the array.

It looks like one cannot¹ store the output of a command whole into a variable in (t)csh, so your only option would be:

set var = "`some command`" # note that it removes the empty lines
if ($#var == 1)...

¹ Strictly speaking, that's not true, one could do something like:

set x = "`some command | paste -d. /dev/null -`"
set var = ""
set nl = '\
'

foreach i ($x:q)
  set i = $i:s/.//:q
  set var = $var:q$i:q$nl:q
end

(Of course, it may not work in all csh implementations/versions.)
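
The role of the paste step in the footnote can be checked on its own in a POSIX shell. This is only a sketch: the three-line input (with an empty middle line) and the variable name prefixed are made up for illustration. Because paste joins an empty field from /dev/null with each input line, using . as the delimiter, every line gains a leading dot. Empty lines are therefore no longer empty and survive the "`...`" expansion in csh; the :s/.// in the loop then strips that dot again.

```shell
# Hypothetical input: three lines, the middle one empty.
prefixed=$(printf 'a\n\nb\n' | paste -d. /dev/null -)

# Every line now starts with ".", including the formerly empty one.
printf '%s\n' "$prefixed"
```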

Related Solutions

Bash – Replace multiple strings in a single pass

OK, a general solution. The following bash function requires 2k arguments; each pair consists of a placeholder and a replacement. It's up to you to quote the strings appropriately to pass them into the function. If the number of arguments is odd, an implicit empty argument will be added, which will effectively delete occurrences of the last placeholder.

Neither placeholders nor replacements may contain NUL characters, but you may use standard C \-escapes such as \0 if you need NULs (and consequently you are required to write \\ if you want a \).

It requires the standard build tools which should be present on a POSIX-like system (lex and cc).

replaceholder() {
  local dir=$(mktemp -d)
  ( cd "$dir"
    { printf %s\\n "%option 8bit noyywrap nounput" "%%"
      printf '"%s" {fputs("%s", yyout);}\n' "${@//\"/\\\"}"
      printf %s\\n "%%" "int main(int argc, char** argv) { return yylex(); }"
    } | lex && cc lex.yy.c
  ) && "$dir"/a.out
  rm -fR "$dir"
}

We assume that \ is already escaped if necessary in the arguments but we need to escape double quotes, if present. That's what the second argument to the second printf does. Since the lex default action is ECHO, we don't need to worry about it.
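
To make the quoting step concrete, the generator part of the function can be run by itself so that it prints the lex specification instead of piping it to lex. The wrapper name gen_lex_spec is hypothetical; the three printf lines are copied from the function above, and the sketch assumes bash for the pattern-substitution expansion.

```shell
# gen_lex_spec is a hypothetical helper: it prints the lex input that
# replaceholder would generate, without invoking lex or cc.
gen_lex_spec() {
  printf %s\\n "%option 8bit noyywrap nounput" "%%"
  # One rule per placeholder/replacement pair; double quotes get escaped.
  printf '"%s" {fputs("%s", yyout);}\n' "${@//\"/\\\"}"
  printf %s\\n "%%" "int main(int argc, char** argv) { return yylex(); }"
}

spec=$(gen_lex_spec A B B A)
printf '%s\n' "$spec"
```

For the arguments A B B A this prints two rules between the two %% separators, one writing B for each A and one writing A for each B.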

Example run (with timings for the skeptical; it's just a cheap-o commodity laptop):

$ time echo AB | replaceholder A B B A
BA

real    0m0.128s
user    0m0.106s
sys     0m0.042s
$ time printf %s\\n AB{0000..9999} | replaceholder A B B A > /dev/null

real    0m0.118s
user    0m0.117s
sys     0m0.043s

For larger inputs it might be useful to provide an optimization flag to cc, and for current POSIX compatibility, it would be better to use c99. An even more ambitious implementation might try to cache the generated executables instead of generating them each time, but they're not exactly expensive to generate.

Edit

If you have tcc, you can avoid the hassle of creating a temporary directory, and enjoy the faster compile time, which will help on normal-sized inputs:

treplaceholder () { 
  tcc -run <(
  {
    printf %s\\n "%option 8bit noyywrap nounput" "%%"
    printf '"%s" {fputs("%s", yyout);}\n' "${@//\"/\\\"}"
    printf %s\\n "%%" "int main(int argc, char** argv) { return yylex(); }"
  } | lex -t)
}

$ time printf %s\\n AB{0000..9999} | treplaceholder A B B A > /dev/null

real    0m0.039s
user    0m0.041s
sys     0m0.031s

Shell Script – How to Delete the Last Character of a String

In a POSIX shell, the syntax ${t:-2} means something different: it expands to the value of t if t is set and non-null, and otherwise to the value 2. To trim a single character by parameter expansion, the syntax you probably want is ${t%?}.
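
A small sketch of the difference, using only POSIX parameter expansion. The sample value ijk and the variable names are illustrative, not taken from the original question.

```shell
# Hypothetical sample value; any non-empty string behaves the same way.
t=ijk

# ${t%?} removes the shortest suffix matching "?", i.e. the last character.
trimmed=${t%?}
printf '%s\n' "$trimmed"

# ${t:-2} is the "use default" form: t is set and non-null, so t is returned.
unchanged=${t:-2}
printf '%s\n' "$unchanged"

# Only when the variable is unset or empty does the default 2 appear.
empty=
fallback=${empty:-2}
printf '%s\n' "$fallback"
```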

Note that in ksh93, bash or zsh, ${t:(-2)} or ${t: -2} (note the space) are legal as a substring expansion but are probably not what you want, since they return the substring starting at a position 2 characters in from the end (i.e. it removes the first character i of the string ijk).
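
A quick check of the substring forms just described, assuming bash (these forms are not POSIX sh). The value ijk is the example string from the text; the variable names are illustrative.

```shell
t=ijk

# Substring starting 2 characters from the end: drops the leading "i".
from_end=${t: -2}
also_from_end=${t:(-2)}

# Without the space or parentheses this is the default-value expansion.
default_form=${t:-2}

printf '%s %s %s\n' "$from_end" "$also_from_end" "$default_form"
```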

See the Shell Parameter Expansion section of the Bash Reference Manual for more info:

Bash Reference Manual – Shell Parameter Expansion
