Separating the JSON from the rest is quite easy. This will give you only the non-JSON part:
python submit.py --provider gt --assignment error-check | sed '/{/,$d'
And this, only the JSON:
python submit.py --provider gt --assignment error-check | sed -n '/{/,$p'
To illustrate, I have saved your example input as file, and:
$ sed '/{/,$d' file
Problem,Correct?,Correct Answer,Agent's Answer
"Challenge Problem B-04",0,4,-1
"Basic Problem B-12",0,1,-1
"Challenge Problem B-05",0,6,-1
"Challenge Problem B-07",0,6,-1
"Challenge Problem B-06",0,3,-1
"Basic Problem B-11",0,1,-1
"Basic Problem B-10",0,3,-1
"Challenge Problem B-03",0,3,-1
"Challenge Problem B-02",0,1,-1
"Challenge Problem B-01",0,6,-1
"Challenge Problem B-09",0,4,-1
"Challenge Problem B-08",0,4,-1
"Basic Problem B-08",0,6,-1
"Basic Problem B-09",0,5,-1
"Basic Problem B-04",0,3,-1
"Basic Problem B-05",0,4,-1
"Basic Problem B-06",0,5,-1
"Basic Problem B-07",0,6,-1
"Basic Problem B-01",0,2,-1
"Basic Problem B-02",0,5,-1
"Basic Problem B-03",0,1,-1
"Challenge Problem B-10",0,4,-1
"Challenge Problem B-11",0,5,-1
"Challenge Problem B-12",0,1,-1
And
$ sed -n '/{/,$p' file
{
"Basic Problems B": {
"Incorrect": "0",
"Skipped": "12",
"Correct": "0",
"Set": "Basic Problems B"
},
"Challenge Problems B": {
"Incorrect": "0",
"Skipped": "12",
"Correct": "0",
"Set": "Challenge Problems B"
}
}
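One caveat with this split: /{/ matches the first line that contains a { anywhere, so a brace in the CSV part (for example in a problem name) would move the split point. A minimal sketch, assuming the JSON always begins with a { at the very start of a line (true for the sample, but an assumption about submit.py in general), anchors the pattern with ^. The sample_output function and its data are made up for illustration:

```shell
# Hypothetical stand-in for the real command's output; the first CSV
# row deliberately contains a "{" to show the difference.
sample_output() {
    cat <<'EOF'
Problem,Correct?,Correct Answer,Agent's Answer
"Odd {Problem} B-01",0,2,-1
"Basic Problem B-02",0,5,-1
{
    "Basic Problems B": {
        "Set": "Basic Problems B"
    }
}
EOF
}

# Non-JSON part: delete from the first line that STARTS with "{" to the end.
sample_output | sed '/^{/,$d'

# JSON part: print from the first line that starts with "{" to the end.
sample_output | sed -n '/^{/,$p'
```

With the unanchored /{/ the same input would be cut at the "Odd {Problem}" row, so only the header line would survive in the non-JSON part.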
Now, you already deal with the non-JSON part perfectly well, so I won't change that. Ideally, the JSON data should be parsed using a JSON parser, like jq. Sadly, I don't know enough jq to do this properly, so the best I could come up with is this rather inelegant solution. At least it does what you want (replace cat file with your python submit.py --provider gt --assignment error-check command):
$ cat file | sed -n 's/[,"]//g; s/^ *//; /{/,$p' | tac | awk -F': ' 'BEGIN{printf "%-30s%-10s%-10s%-10s\n", "Set", "Incorrect", "Skipped", "Correct"} NF==2 && !/\{/{if($1=="Set"){set=$2;data[set]["Incorrect"] = 0;data[set]["Skipped"] = 0;data[set]["Correct"] = 0;} data[set][$1]=$2}END{for(set in data){printf "%-30s%-10s%-10s%-10s\n", set,data[set]["Incorrect"],data[set]["Skipped"],data[set]["Correct"]}}'
Set                           Incorrect Skipped   Correct
Challenge Problems B          0         12        0
Basic Problems B              0         12        0
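Two things in that one-liner are easy to miss. tac reverses the lines so that each "Set" line is seen before the counters that belong to it, and data[set][$1] is an array of arrays, which is a GNU awk (4.0 or later) feature. For other awk implementations, a minimal sketch of the same idea can use the classic data[set, key] form plus a separate list of set names, which also makes the row order predictable. This is only checked against the sample data; sample_json and summarize are made-up names, and tac is still assumed to be available:

```shell
# Hypothetical stand-in for the JSON part of the output.
sample_json() {
    cat <<'EOF'
{
    "Basic Problems B": {
        "Incorrect": "0",
        "Skipped": "12",
        "Correct": "0",
        "Set": "Basic Problems B"
    },
    "Challenge Problems B": {
        "Incorrect": "0",
        "Skipped": "12",
        "Correct": "0",
        "Set": "Challenge Problems B"
    }
}
EOF
}

summarize() {
    # Strip commas, quotes and leading spaces, then reverse the lines so
    # that "Set" precedes the counters of its own object.
    sed 's/[,"]//g; s/^ *//' | tac |
    awk -F': ' '
        BEGIN { printf "%-30s%-10s%-10s%-10s\n", "Set", "Incorrect", "Skipped", "Correct" }
        NF == 2 && !/\{/ {
            if ($1 == "Set") {
                set = $2
                names[++n] = set        # remember sets in order of appearance
                data[set, "Incorrect"] = data[set, "Skipped"] = data[set, "Correct"] = 0
            }
            data[set, $1] = $2          # portable multi-dimensional key
        }
        END {
            for (i = 1; i <= n; i++) {
                s = names[i]
                printf "%-30s%-10s%-10s%-10s\n", s, data[s, "Incorrect"], data[s, "Skipped"], data[s, "Correct"]
            }
        }'
}

sample_json | summarize
```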
Putting all this together in a shell script gives:
#!/bin/bash
tmpFile=$(mktemp)
python submit.py --provider gt --assignment error-check > "$tmpFile"
sed '/{/,$d' "$tmpFile" | column -t -s,
sed -n 's/[,"]//g; s/^ *//; /{/,$p' "$tmpFile" |
    tac |
    awk -F': ' '
        BEGIN{
            printf "%-30s%-10s%-10s%-10s\n", "Set", "Incorrect", "Skipped", "Correct"
        }
        NF==2 && !/\{/{
            if($1=="Set"){
                set=$2;
                data[set]["Incorrect"] = 0;
                data[set]["Skipped"] = 0;
                data[set]["Correct"] = 0;
            }
            data[set][$1]=$2
        }
        END{
            for(set in data){
                printf "%-30s%-10s%-10s%-10s\n", set,
                    data[set]["Incorrect"],
                    data[set]["Skipped"],
                    data[set]["Correct"]
            }
        }'
rm "$tmpFile"
Which produces the following output:
$ foo.sh
Problem                   Correct?  Correct Answer  Agent's Answer
"Challenge Problem B-04"  0         4               -1
"Basic Problem B-12"      0         1               -1
"Challenge Problem B-05"  0         6               -1
"Challenge Problem B-07"  0         6               -1
"Challenge Problem B-06"  0         3               -1
"Basic Problem B-11"      0         1               -1
"Basic Problem B-10"      0         3               -1
"Challenge Problem B-03"  0         3               -1
"Challenge Problem B-02"  0         1               -1
"Challenge Problem B-01"  0         6               -1
"Challenge Problem B-09"  0         4               -1
"Challenge Problem B-08"  0         4               -1
"Basic Problem B-08"      0         6               -1
"Basic Problem B-09"      0         5               -1
"Basic Problem B-04"      0         3               -1
"Basic Problem B-05"      0         4               -1
"Basic Problem B-06"      0         5               -1
"Basic Problem B-07"      0         6               -1
"Basic Problem B-01"      0         2               -1
"Basic Problem B-02"      0         5               -1
"Basic Problem B-03"      0         1               -1
"Challenge Problem B-10"  0         4               -1
"Challenge Problem B-11"  0         5               -1
"Challenge Problem B-12"  0         1               -1
Set                           Incorrect Skipped   Correct
Challenge Problems B          0         12        0
Basic Problems B              0         12        0
It feels hacky though, and I hope someone can come up with a cleaner solution with dedicated JSON parsers.
Steeldriver was nice enough to give a proper jq solution in a comment, so if we incorporate that, we get the far simpler (and safer):
#!/bin/bash
tmpFile=$(mktemp)
python submit.py --provider gt --assignment error-check > "$tmpFile";
sed '/{/,$d' "$tmpFile" | column -t -s,
sed -n '/{/,$p' "$tmpFile" |
jq -r '["Set","Incorrect","Skipped","Correct"], (.[] | [.Set,.Incorrect,.Skipped,.Correct]) | @tsv'
rm "$tmpFile"
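One optional hardening, which is not part of the original answer: if the script is interrupted or a command fails midway, the final rm never runs and the temporary file is left behind. A minimal sketch using an EXIT trap is below. run_submit is a hypothetical stand-in for the real python submit.py call, the column and jq steps are left out to keep the example short, and the body is wrapped in a subshell function only so that the trap fires as soon as the function returns:

```shell
#!/bin/bash
# Hypothetical stand-in for:
#   python submit.py --provider gt --assignment error-check
run_submit() {
    printf '%s\n' 'Problem,Correct?' '"Basic Problem B-01",0' \
                  '{' '    "Set": "Basic Problems B"' '}'
}

# The parentheses make the function body a subshell, so the EXIT trap
# below runs when report returns, even if a command inside it fails.
report() (
    tmpFile=$(mktemp) || exit 1
    trap 'rm -f "$tmpFile"' EXIT
    run_submit > "$tmpFile"
    sed '/{/,$d' "$tmpFile"      # non-JSON part
    sed -n '/{/,$p' "$tmpFile"   # JSON part
)

report
```

In a plain top-level script the same trap line placed right after mktemp would do, and the explicit rm at the end becomes unnecessary.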
Best Answer
Attempt 1
A solution using just Perl, returning a simple hash-of-hashes structure. This was written before the OP clarified the JSON data format.
The File::Find module works in a similar way to the Unix find command. The JSON module takes Perl variables and converts them into JSON.
The script iterates down the file structure from the present working directory, calling the subroutine process_dir for each file/directory under ".", and no_chdir tells Perl not to issue a chdir() for each directory it finds.
process_dir returns if the file currently being examined is not a directory:
We then grab a reference to the existing hash %$dirs into $ref, split the file path around / and loop with for, adding a new hash key for each path component.
Making a directory structure like slm did:
The output is:
Attempt 2
Okay now with different data structure...
And then running the script on the proposed directory structure...
I found this pretty damn tricky to get right (especially given the "hash if sub directories, array if not, OH UNLESS top level, then just hashes anyway" logic). So I'd be surprised if this was something you could do with sed/awk... but then Stephane hasn't looked at this yet I bet :)