Shell – Equivalent of Java’s String.getBytes() in Unix Shell (Cygwin)

binary java openssl shell

Let's say I convert my string into a byte array.

byte[] bytes = sUserID.getBytes("UTF-8");  // Convert the user ID string to a byte array

Now I need to write a shell script that has exactly the same functionality as my Java code. At some stage I must hash my byte array (using MessageDigest.getInstance("SHA-256") in Java and openssl dgst -sha256 -binary in the shell), but because the digests in the Java code are generated from byte arrays, they don't match the results I get in the shell (at the moment I simply hash strings there, so the input formats don't match).

Because my input for openssl in the shell should match the Java input, I want to know whether there is a way to "simulate" the getBytes() method in the shell. I don't have much experience with shell scripting, so I don't know what the best approach would be. Any ideas? Cheers!

Best Answer

openssl's stdin is a byte stream.

The content of $user is a sequence of non-zero bytes (which may or may not form valid characters in UTF-8 or another character set/encoding).

printf %s "$user"'s stdout is a byte stream.

printf %s "$user" | openssl dgst -sha256 -binary

This connects printf's stdout to openssl's stdin. openssl's stdout is another byte stream.
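To see which bytes actually reach openssl, the same printf output can be piped into od instead. This is only a minimal sketch: the value of $user is a hypothetical, ASCII-only user ID (so the bytes are the same in every common locale), and it also shows why echo is a poor substitute for printf %s here.

```shell
# Hypothetical user ID; ASCII only, so its bytes do not depend on the locale
user='alice'

# Dump the exact bytes that printf sends down the pipe, as hex
bytes=$(printf %s "$user" | od -An -tx1 | tr -d ' \n')
echo "$bytes"        # 616c696365

# echo appends a newline (0x0a), which would change the digest
bytes_echo=$(echo "$user" | od -An -tx1 | tr -d ' \n')
echo "$bytes_echo"   # 616c6963650a
```

For an ASCII-only string these are the same bytes that getBytes("UTF-8") would produce on the Java side, so hashing the printf output should give a matching digest.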

Now, if you're reading $user from a user at a terminal, the user will enter it by pressing keys on their keyboard. The terminal will send the corresponding characters (as written on the key labels) encoded in its configured character set. Usually, that character set will be the one used by the current locale. You can find out what that is with locale charmap.

For instance, with a locale like fr_FR.iso885915@euro, and an xterm started in that locale, locale charmap will return ISO-8859-15. If the user enters stéphane as the username, that é will likely be encoded as the 0xe9 byte because that's how it's defined in the ISO-8859-15 character set.

If you want that é to be encoded as UTF-8 before passing to openssl, that's where you'd use iconv to convert that 0xe9 byte to the corresponding encoding in UTF-8 (two bytes: 0xc3 0xa9):

IFS= read -r user # read username from stdin as a sequence of bytes
                  # assumed to be encoded from characters as per the
                  # locale's encoding
printf %s "$user" |
  iconv -t utf-8 | # convert from locale encoding to UTF-8
  openssl dgst -sha256 -binary
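The conversion can be checked on the stéphane example without a terminal. The following is a sketch under a few assumptions: the iconv in use accepts the names ISO-8859-15 and UTF-8 (spellings vary between implementations), and the source encoding is named explicitly with -f rather than taken from the locale, so the result does not depend on the current environment.

```shell
# "stéphane" as ISO-8859-15 bytes; é is 0xe9, written in octal (\351)
# because POSIX printf has no \x escape in the format string
latin9_hex=$(printf 'st\351phane' | od -An -tx1 | tr -d ' \n')
echo "$latin9_hex"   # 7374e97068616e65

# The same name after converting to UTF-8: 0xe9 becomes 0xc3 0xa9
utf8_hex=$(printf 'st\351phane' | iconv -f ISO-8859-15 -t UTF-8 | od -An -tx1 | tr -d ' \n')
echo "$utf8_hex"     # 7374c3a97068616e65
```

The second byte sequence is what getBytes("UTF-8") returns for the same name in Java, so it is the form that has to be fed to openssl for the digests to match.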

Related Solutions

Shell – Unix equivalent of PowerShell

No, it is the other way around. There is no spoon^H^H^Hstructured data. There is only text.

A big part of the Unix philosophy is based on the idea of outputting text and accepting text as input. You might want to consider reading "The Art of Unix Programming", which has a nice explanation about this.

Don't get me wrong: I understand your point and I know what you are trying to get at. There are things like the interactive interpreters of Ruby and Python, which can be used as a shell, but they are not as friendly for basic tasks as Bash is. Try and change directory, for example.

Also, using objects in a shell is not all that it is made out to be. If only your shell supported them, you would be at a loss on Unix: all the standard Unix text manipulation tools, such as grep, awk, and sed, would have to be altered.

I think there was an attempt to create something like this a few years back, but I can't remember the name and I haven't heard about it in a long time. It's probably not going to take off.

Shell – Equivalent of forward/back buttons for unix shell (when navigating directories)

pushd and popd can be very useful. For example, try


$ pushd somedir

and when you are done, just do


$ popd

and you are back where you started.

The best part is you can do


$ pushd somedir

$ pushd anotherdir

$ pushd onemoredir

and then you can "step back" one at a time using popd.

Good luck!
