Assuming neither file is huge, the Python script below will do the job as well.
How it works
Both files are read by the script. The lines in file_1 (the file that has precedence) are split at the directory you entered for the file in the head section (in your example, /mnt/app/).
Subsequently, the lines in file_1 are written to the output file (the merged file). At the same time, lines from file_2 are removed from the line list if the identifying string (the section after the mount point) occurs in the line.
Finally, the "remaining" lines of file_2 (those for which no duplicate exists in file_1) are written to the output file as well. The result:
file_1:
1058,b8203a236b4f15316e516165a6546666,/mnt/app/Certificados/ca.crt
2694,8a815adefde4fa0c263e74832b15de64,/mnt/app/Certificados/ca.db.certs/01.pem
136,77bf2e5313dbaac4df76a4b72df2e2ad,/mnt/app/Certificados/ca.db.index
file_2:
1058,b8203a236b4f15316e516165a6546666,/mnt/app/Certificados/ca.crt
3,72b2ac90f7f3ff075a937d6be8fc3dc3,/mnt/temp/Certificados/ca.db.serial
2694,8a815adefde4fa0c263e74832b15de64,/mnt/app/Certificados/ca.db.certs/01.pem
136,77bf2e5313dbaac4df76a4b72df2e2ad,/mnt/app/Certificados/ca.db.index
merged:
1058,b8203a236b4f15316e516165a6546666,/mnt/app/Certificados/ca.crt
2694,8a815adefde4fa0c263e74832b15de64,/mnt/app/Certificados/ca.db.certs/01.pem
136,77bf2e5313dbaac4df76a4b72df2e2ad,/mnt/app/Certificados/ca.db.index
3,72b2ac90f7f3ff075a937d6be8fc3dc3,/mnt/temp/Certificados/ca.db.serial
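The matching step described under "How it works" can be tried on the sample data without touching any files. This is only a minimal sketch: the function name merge_lines and the in-memory lists are assumptions made for illustration, not part of the original script.

```python
def merge_lines(lines1, lines2, m_point):
    """Return lines1 followed by the lines of lines2 that match no line of lines1."""
    merged = []
    remaining = list(lines2)  # copy, so the caller's list is left untouched
    for line in lines1:
        merged.append(line)
        # identifying string: the part of the line after the mount point
        key = line.split(m_point)[-1]
        # drop every line of file_2 that contains the identifying string
        remaining = [other for other in remaining if key not in other]
    return merged + remaining


file_1 = [
    "1058,b8203a236b4f15316e516165a6546666,/mnt/app/Certificados/ca.crt",
    "2694,8a815adefde4fa0c263e74832b15de64,/mnt/app/Certificados/ca.db.certs/01.pem",
    "136,77bf2e5313dbaac4df76a4b72df2e2ad,/mnt/app/Certificados/ca.db.index",
]
file_2 = [
    "1058,b8203a236b4f15316e516165a6546666,/mnt/app/Certificados/ca.crt",
    "3,72b2ac90f7f3ff075a937d6be8fc3dc3,/mnt/temp/Certificados/ca.db.serial",
    "2694,8a815adefde4fa0c263e74832b15de64,/mnt/app/Certificados/ca.db.certs/01.pem",
    "136,77bf2e5313dbaac4df76a4b72df2e2ad,/mnt/app/Certificados/ca.db.index",
]

result = merge_lines(file_1, file_2, "/mnt/app")
print("\n".join(result))
```

Because only the part after the mount point is compared, a line in file_2 that lists the same file under a different mount point is treated as a duplicate as well.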
The script
#!/usr/bin/env python3
#---set the path to file1, file2 and the mountpoint used in file1 below
f1 = "/path/to/file_1"; m_point = "/mnt/app"; f2 = "/path/to/file_2"
merged = "/path/to/merged_file"
#---
# pair each line of file_1 with its identifying string (the part after the mount point)
lines1 = [(l, l.split(m_point)[-1]) for l in open(f1).read().splitlines()]
lines2 = open(f2).read().splitlines()
with open(merged, "a+") as out:
    for l in lines1:
        out.write(l[0]+"\n")
        # drop every line of file_2 that contains the identifying string
        for line in [line for line in lines2 if l[1] in line]:
            lines2.remove(line)
    # write the lines of file_2 that have no counterpart in file_1
    for l in lines2:
        out.write(l+"\n")
How to use
- Copy the script into an empty file and save it as merge.py.
- In the head section of the script, set the paths to f1 (file_1), f2 (file_2) and the merged file, and set the mount point as it appears in file_1.
Run it with the command:
python3 /path/to/merge.py
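If editing the head section for every run is inconvenient, the same steps could be wrapped in a small command-line interface. The following is only a sketch and not part of the original script: main, merge_files and the --mount-point option are hypothetical names, the default mount point is taken from the example, and the output file is overwritten instead of appended to.

```python
import argparse


def merge_files(f1, f2, merged, m_point):
    """Write file_1, then the lines of file_2 that match no line of file_1."""
    with open(f1) as fh:
        lines1 = fh.read().splitlines()
    with open(f2) as fh:
        lines2 = fh.read().splitlines()
    # identifying strings: the part of each file_1 line after the mount point
    keys = [line.split(m_point)[-1] for line in lines1]
    rest = [line for line in lines2 if not any(key in line for key in keys)]
    # "w" replaces an existing output file; the original script appends with "a+"
    with open(merged, "w") as out:
        for line in lines1 + rest:
            out.write(line + "\n")


def main(argv=None):
    # hypothetical command-line interface; the original script has none
    parser = argparse.ArgumentParser(
        description="Merge two file lists; file_1 has precedence.")
    parser.add_argument("file_1")
    parser.add_argument("file_2")
    parser.add_argument("merged")
    parser.add_argument("--mount-point", default="/mnt/app",
                        help="mount point used in file_1 (assumed default)")
    args = parser.parse_args(argv)
    merge_files(args.file_1, args.file_2, args.merged, args.mount_point)
```

With a final call to main() added at the end of the file, it could be run as python3 /path/to/merge.py /path/to/file_1 /path/to/file_2 /path/to/merged_file --mount-point /mnt/app.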
Edit
Or a tiny bit shorter:
#!/usr/bin/env python3
#---set the path to file1, file2 and the mountpoint used in file1 below
f1 = "/path/to/file_1"; m_point = "/mnt/app"; f2 = "/path/to/file_2"
merged = "/path/to/merged_file"
#---
lines = lambda f: open(f).read().splitlines()
lines1 = lines(f1); lines2 = lines(f2); checks = [l.split(m_point)[-1] for l in lines1]
# keep only the lines of file_2 that contain no identifying string from file_1
lines2 = [l for l in lines2 if not any(c in l for c in checks)]
with open(merged, "a+") as out:
    for item in lines1+lines2:
        out.write(item+"\n")
Best Answer
Assuming "inputs are given through command line" means that the program accepts command-line arguments, you should be able to do this with xargs.
Ex. given a minimal C program
compiled with gcc -o someprog someprog.c, and a file of whitespace-separated input arguments, inputs.txt
Then
If your input file has a header, you will need to remove it (using tail, for example).
If parallel processing would be computationally advantageous for your task, you could do a similar thing with GNU Parallel.