How can I select the lines from my text files that are similar to this one?
"created_at": "Wed Oct 19 12:36:54 +0000 2016"
Basically, I need to find lines containing a string that
- starts with Wed Oct 19
- and ends with 2016
However, the Wed Oct 19 12:36:54 +0000 2016 could be anywhere in the line, and any other time of day could be in between.
When I use
grep -irn "Wed Oct 19" | grep -irn "2016"
I get all sorts of unwanted results.
Here's an example of a similar line from the file I don't want to match:
"created_at": "Tue Jan 31 18:50:26 +0000 2012",
This is part of a tweet's attributes.
Here's a longer part of the input:
"contributors": null,
"retweeted": false,
"in_reply_to_user_id_str": null,
"place": null,
"retweet_count": 4,
"created_at": "Sun Apr 03 23:48:36 +0000 2011",
"retweeted_status": {
"text": "In preparation for the NFL lockout, I will be spending twice as much time analyzing my fantasy baseball team during company time. #PGP",
"truncated": false,
"in_reply_to_user_id": null,
"in_reply_to_status_id": null,
The complete example input is here:
https://gist.github.com/hrp/900964
UPDATE: I am looking for the file names that contain this pattern in them.
Best Answer
If it could be anywhere in the line, and anything could be in between, I guess
should get it...
If you only want the filenames, use
-l
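A minimal sketch of such a command, assuming GNU grep (for -r). The sample files and the temporary directory are hypothetical and only there to make the example self-contained:

```shell
# Hypothetical sample data: one file that should match, one that should not.
dir=$(mktemp -d)
printf '%s\n' '  "created_at": "Wed Oct 19 12:36:54 +0000 2016",' > "$dir/match.json"
printf '%s\n' '  "created_at": "Tue Jan 31 18:50:26 +0000 2012",' > "$dir/other.json"

# Show lines containing "Wed Oct 19", then anything, then "2016"
# (-r: recurse into the directory, -n: line numbers, -w: word boundaries).
grep -rnw "Wed Oct 19.*2016" "$dir"

# Show only the names of the files that contain such a line (-l).
matches=$(grep -rlw "Wed Oct 19.*2016" "$dir")
echo "$matches"
```

Because both parts of the date are in a single pattern, they have to appear on the same line and in that order.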
Notes
-w: use word boundaries in case the text you want is stuck onto something else we don't want to match (unlikely in this case)
-l: just print the filenames of files that contain the match
.*: any number of any characters here
It's probably OK to parse this file with grep, especially for something so simple, but using a JSON parser, as mentioned in David Foerster's answer, is the Right Way (i.e. it will likely be more reliable, especially if you need to do anything complex).
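If you stay with grep, one way to cut down on false matches is to spell out the whole field instead of relying on .*. This is only a sketch: it assumes GNU grep, that the key and value sit on one line, and that the timestamp always has the form shown in the question. The sample files are hypothetical:

```shell
# Hypothetical sample data.
dir=$(mktemp -d)
printf '%s\n' '"created_at": "Wed Oct 19 12:36:54 +0000 2016",' > "$dir/a.json"
# Not a creation date, but it still contains "Wed Oct 19", some text, then "2016".
printf '%s\n' '"text": "Wed Oct 19 was fun", "id": 2016' > "$dir/b.json"

# Loose pattern: reports both files.
loose_hits=$(grep -rlw "Wed Oct 19.*2016" "$dir" | sort)

# Stricter extended regex: key name, HH:MM:SS, UTC offset, then the year.
strict='"created_at": "Wed Oct 19 [0-9]{2}:[0-9]{2}:[0-9]{2} [+-][0-9]{4} 2016"'
strict_hits=$(grep -rlE "$strict" "$dir")

echo "loose:"; echo "$loose_hits"
echo "strict:"; echo "$strict_hits"
```

The stricter pattern still depends on how the JSON happens to be laid out, which is why a real JSON parser remains the more reliable choice.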