[SANS ISC] Filter JSON Data by Value with Linux jq, (Sun, Aug 29th)

JSON has become more prevalent as a data service. Unfortunately, it isn't at all Bash friendly: manipulating JSON data at the command line with regex tools (e.g. sed, grep) is cumbersome, and it is difficult to get the output I want.

The Linux tool I use for this is jq. It is written specifically to filter and manipulate JSON, so I can script the extraction of the data I need from a large JSON file and get it in an output format I can easily read and manipulate.

The most common form of logs I work with are JSON arrays (they start with [ and end with ]). Here is a basic example to demonstrate how to iterate over an array:

echo '["a","b","c"]' | jq '.[]'

The object value iterator operator .[] prints each item in the array on a separate line:

"a"
"b"
"c"
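
As a small variation on the iterator, here is a minimal sketch that assumes a reasonably recent jq (1.5 or later). The -r (raw output) option prints string values without the surrounding quotes, which is convenient when the result feeds other shell tools:

```shell
# Iterate over the array and print each element as a raw string (no quotes).
echo '["a","b","c"]' | jq -r '.[]'
# prints:
# a
# b
# c
```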


In this next example, I take the data from the bot_ip.json file and parse the list of IP addresses and the site each came from, using this command:

cat bot_ip.json | jq '.objects[].ip + ": " + .objects[].source' | sort | uniq

The output looks like this:

" Botscout BOT IPs"
" Botscout BOT IPs"
"2607:90:6628:470:0:4:0:801: Botscout BOT IPs"

Since this file contains the objects key before the opening [, I use it as an anchor to start parsing the data I want to see. I added the colon (:) separator between the IP and the data source.
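
One caution about this kind of filter, shown as a minimal sketch on made-up data (the IP addresses and feed names below are hypothetical, not taken from bot_ip.json). When .objects[] appears more than once in a single expression, jq combines every value from one iterator with every value from the other. With a single source that is harmless after sort | uniq, but with several sources it pairs IPs with the wrong feed. Iterating once and piping each record into the string expression keeps the fields of a record together:

```shell
# Hypothetical sample in the same shape as bot_ip.json (values are made up).
sample='{"objects":[{"ip":"192.0.2.10","source":"Feed A"},{"ip":"198.51.100.7","source":"Feed B"}]}'

# Repeating .objects[] combines every ip with every source: 4 distinct lines.
echo "$sample" | jq '.objects[].ip + ": " + .objects[].source' | sort | uniq

# Iterating once keeps each ip with its own source: 2 lines.
echo "$sample" | jq '.objects[] | .ip + ": " + .source' | sort | uniq
```

The second form also scales better on large files, because it walks the array once instead of once per field.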

This second example uses mal_url.json, which contains known malware URL locations, parsed with this command:

cat mal_url.json | jq '.objects[].value + ": " + .objects[].source + ": " + .objects[].threat_type' | sort | uniq

" URLHaus: malware"
" URLHaus: malware"
" URLHaus: malware"
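
To filter records by the value of a field, jq provides select(). The following is a minimal sketch on made-up data: the field names value, source and threat_type come from the command above, while the URLs, the second feed name and the "phishing" type are hypothetical.

```shell
# Hypothetical sample in the same shape as mal_url.json (values are made up).
sample='{"objects":[
  {"value":"http://example.com/a.exe","source":"URLHaus","threat_type":"malware"},
  {"value":"http://example.net/login","source":"Feed B","threat_type":"phishing"}
]}'

# Keep only the records whose threat_type is "malware", then format them.
echo "$sample" | jq -r '.objects[]
  | select(.threat_type == "malware")
  | .value + ": " + .source + ": " + .threat_type'
# prints: http://example.com/a.exe: URLHaus: malware
```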

The test file used here [2] contains several records that can be used to practice manipulating JSON data. Using wget, download the file to a Linux workstation and ensure that jq is already installed (e.g. on CentOS: yum -y install jq). Next, take a quick look at the raw file using a Linux command of your choice (less, more, cat, etc.) before parsing some of the data with jq. To view the data properly formatted and readable, use this command:

cat large-file.json | jq '.' | more
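
Before paging through a large file, it can help to check its size and look at a single record. This is a minimal sketch on a tiny stand-in for large-file.json. It assumes the file is a top-level array whose records have an actor object with a login field, and the ids and logins are made up:

```shell
# Tiny stand-in for large-file.json (hypothetical values).
sample='[{"id":"1","actor":{"login":"alice"}},{"id":"2","actor":{"login":"bob"}}]'

# Count the records in the top-level array.
echo "$sample" | jq 'length'
# prints: 2

# Pretty-print only the first record.
echo "$sample" | jq '.[0]'
```

Against the real file, the same filters would be run as jq 'length' large-file.json and jq '.[0]' large-file.json.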

To get a list of actors with their current information, run this command:

cat large-file.json | jq '.[].actor' | more

To get just the list of actor login information, add .login to .actor:

cat large-file.json | jq '.[].actor.login' | more
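
The same login is likely to appear many times in a file like this. As a minimal sketch (again on made-up logins, assuming the same array-of-records shape), collecting the values into an array and applying jq's unique gives a sorted list without duplicates:

```shell
# Tiny stand-in for large-file.json (hypothetical logins, one repeated).
sample='[{"actor":{"login":"bob"}},{"actor":{"login":"alice"}},{"actor":{"login":"bob"}}]'

# Collect the logins into an array, de-duplicate and sort them, then print one per line.
echo "$sample" | jq -r '[.[].actor.login] | unique | .[]'
# prints:
# alice
# bob
```

Piping jq -r '.[].actor.login' through sort | uniq, as in the earlier examples, gives the same list.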

What are some of your favorite tools to manipulate JSON data?

[1] https://stedolan.github.io/jq/manual/
[2] https://github.com/json-iterator/test-data/raw/master/large-file.json
[3] https://gchq.github.io/CyberChef/
[4] http://iplists.firehol.org/?ipset=botscout

Guy Bruneau IPSS Inc.
My Handler Page
Twitter: GuyBruneau
gbruneau at isc dot sans dot edu

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.

