You could use the utility program pdfnup from the pdfjam suite.
pdfnup in.pdf --nup 3x3
This should write the file in-nup.pdf, in which the pages of in.pdf are arranged nine to a sheet in a 3x3 grid.
You should first merge all of your PDF files into a single file. You may also want to specify a paper size for the output file; see the pdfjam documentation for details.
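As a rough sketch of the merge and paper-size step: pdfjam accepts several input files in one call and concatenates them, so the merge and the 3x3 layout can be done together. The function name nup_merge, the PDFJAM override variable, and the choice of A4 paper are assumptions made for illustration; check pdfjam --help for the options your version supports.

```shell
#!/bin/sh
# Sketch: concatenate several PDFs and lay them out 3x3 in one pdfjam call.
# PDFJAM can be overridden (e.g. PDFJAM=echo) for a dry run that only
# prints the arguments instead of producing a file.
PDFJAM=${PDFJAM:-pdfjam}

# nup_merge OUTPUT INPUT... (hypothetical helper name)
nup_merge() {
    out=$1
    shift
    # --nup 3x3       : nine source pages per output sheet
    # --paper a4paper : output paper size (assumed here; pick your own)
    # --outfile       : name of the combined result
    "$PDFJAM" "$@" --nup 3x3 --paper a4paper --outfile "$out"
}
```

For example, nup_merge all-nup.pdf first.pdf second.pdf would write all-nup.pdf. Running it with PDFJAM=echo first prints the arguments without touching any files.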
The following awk program stores a count of how many times each set of three consecutive words occurs (after removing punctuation characters) and, at the end, prints the count and the set of words wherever the count is larger than 1:
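{
    # Strip punctuation so that "Player:" and "Player" count as the same word
    gsub("[[:punct:]]", "")
    # Count every run of three consecutive words on the line
    for (i = 3; i <= NF; ++i)
        w[$(i-2),$(i-1),$i]++
}
END {
    for (key in w) {
        count = w[key]
        if (count > 1) {
            # Turn the internal subscript separator back into spaces
            gsub(SUBSEP," ",key)
            print count, key
        }
    }
}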
Given the text in your question, this produces
2 Search Inside Yourself
2 Cultivate The Three
2 The Three Essential
2 Joy on Demand
2 Recognize and Cultivate
2 Three Essential Virtues
2 and Cultivate The
2 The Ideal Team
3 Ideal Team Player
As you can see, this may not be so useful.
Instead, we can collect the same count information and then do a second pass over the file, printing each line that contains a word triplet with a count larger than one:
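# First pass over the file: count every word triplet
NR == FNR {
    gsub("[[:punct:]]", "")
    for (i = 3; i <= NF; ++i)
        w[$(i-2),$(i-1),$i]++
    next
}
# Second pass: print the original line if any of its triplets occurs more than once
{
    orig = $0
    gsub("[[:punct:]]", "")
    for (i = 3; i <= NF; ++i)
        if (w[$(i-2),$(i-1),$i] > 1) {
            print orig
            next
        }
}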
Testing on your file:
$ cat file
The Ideal Team Player
The Ideal Team Player: How to Recognize and Cultivate The Three Essential Virtues
Ideal Team Player: Recognize and Cultivate The Three Essential Virtues
Joy on Demand: The Art of Discovering the Happiness Within
Crucial Conversations Tools for Talking When Stakes Are High
Joy on Demand
Search Inside Yourself: The Unexpected Path to Achieving Success, Happiness
Search Inside Yourself
$ awk -f script.awk file file
The Ideal Team Player
The Ideal Team Player: How to Recognize and Cultivate The Three Essential Virtues
Ideal Team Player: Recognize and Cultivate The Three Essential Virtues
Joy on Demand: The Art of Discovering the Happiness Within
Joy on Demand
Search Inside Yourself: The Unexpected Path to Achieving Success, Happiness
Search Inside Yourself
Caveat: This awk program needs enough memory to store the text of your file about three times over, and it may report duplicates for common phrases even when the entries are not really duplicates (e.g. "how to cook" may be part of the titles of several books).
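One way to reduce false matches on common phrases is to require a longer run of shared words. The following is a minimal sketch of the same two-pass approach with the run length passed in as a parameter; the function names find_repeats and ngram and the variable n are made up for this illustration, and larger values of n trade fewer false positives for more missed short titles.

```shell
#!/bin/sh
# find_repeats N FILE: print the lines of FILE that share a run of N
# consecutive words with at least one other occurrence in FILE.
find_repeats() {
    awk -v n="$1" '
        # Join fields i-n+1 .. i into one key, separated by SUBSEP.
        function ngram(i,    j, key) {
            key = $(i - n + 1)
            for (j = i - n + 2; j <= i; ++j)
                key = key SUBSEP $j
            return key
        }
        # First pass: count every run of n words.
        NR == FNR {
            gsub(/[[:punct:]]/, "")
            for (i = n; i <= NF; ++i)
                count[ngram(i)]++
            next
        }
        # Second pass: print the original line if any of its runs repeats.
        {
            orig = $0
            gsub(/[[:punct:]]/, "")
            for (i = n; i <= NF; ++i)
                if (count[ngram(i)] > 1) {
                    print orig
                    next
                }
        }
    ' "$2" "$2"
}
```

Calling find_repeats 3 file should behave like the script above, while find_repeats 4 file only reports lines that share four consecutive words, so lines shorter than four words are never reported.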
Digikam
Add all the photos to your collection. In the menu, select “Tools / Find duplicates”. This will look for duplicates across your whole collection.
Findimagedupes
A command line tool. Pass all the images you want to compare on the command line.
Geeqie (formerly gqview)
In the menu, select “File / Find duplicate”. Drag and drop image files to the duplicates window. You can drop directories to add their contents recursively.
Fdupes
A command line tool to find byte-for-byte duplicates in a directory tree.
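To illustrate what a byte-for-byte duplicate search amounts to, here is a minimal sketch that groups files by SHA-256 checksum. It is not fdupes itself, only a stand-in built from find, sha256sum (assumed to be the GNU coreutils version), sort and awk; the function name list_dupes is made up for this example, and file names containing newlines or backslashes are not handled.

```shell
#!/bin/sh
# list_dupes DIR: print groups of files under DIR whose SHA-256 checksums
# are identical, with a blank line between groups.
list_dupes() {
    find "$1" -type f -exec sha256sum {} + |
        sort |
        awk '
            {
                hash = $1
                # Each line is a 64-character hash, two separator
                # characters, then the path, so the path starts at 67.
                path = substr($0, 67)
                if (hash == prev_hash) {
                    if (!in_group) {
                        # Separate this group from the previous one.
                        if (groups++) print ""
                        print prev_path
                        in_group = 1
                    }
                    print path
                } else {
                    in_group = 0
                }
                prev_hash = hash
                prev_path = path
            }
        '
}
```

For example, list_dupes ~/Pictures would list each set of identical files. Unlike Digikam, Findimagedupes or Geeqie, this only catches exact copies, not resized or re-encoded versions of the same photo.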
(Reposted from https://askubuntu.com/questions/4072/how-can-i-find-duplicate-photos)