jq as a time-saving tool

Girish V P
ADIBI Technologies, Basavanagudi, Blore.
Mar 4, 2024

Searching and manipulating JSON content is hard, especially when the file is large. I don't want to rely on traditional text-processing tools such as cut, grep, sed and awk. Instead, jq has been helpful in every scenario where I needed to filter complex JSON structures. Let me demonstrate with some AWS Kinesis data stream CLI commands combined with jq. I assume that a Kinesis data stream named my-kinesis-stream has been created and that jq and the AWS CLI are installed on a Linux system.

1. List all Kinesis data streams. The first command returns JSON output with overwhelming detail. The second command filters the JSON and returns only the stream names.
$ aws kinesis list-streams
$ aws kinesis list-streams | jq '.StreamSummaries | .[].StreamName'
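
To try the filter without an AWS account, the same expression can be run against a hand-written sample. This is only a sketch: the JSON below is a made-up, trimmed approximation of what list-streams returns, and the second stream name is an assumption. The -r option prints the names without the surrounding quotes.

```shell
# Hypothetical, trimmed sample of list-streams output (values are made up).
sample='{
  "StreamNames": ["my-kinesis-stream", "another-stream"],
  "StreamSummaries": [
    {"StreamName": "my-kinesis-stream", "StreamStatus": "ACTIVE"},
    {"StreamName": "another-stream", "StreamStatus": "ACTIVE"}
  ]
}'

# Same filter as above, written compactly; -r drops the JSON quotes.
stream_names=$(printf '%s' "$sample" | jq -r '.StreamSummaries[].StreamName')
echo "$stream_names"
```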

2. To see the details of a Kinesis data stream:

$ aws kinesis describe-stream --stream-name my-kinesis-stream

3. To list all Kinesis shards. The first command returns JSON output with a lot of information. I have to combine the command with jq to list only the shard IDs.

$ aws kinesis describe-stream --stream-name my-kinesis-stream
$ aws kinesis describe-stream --stream-name my-kinesis-stream | jq '.StreamDescription.Shards | .[].ShardId'
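
The shard filter can be checked offline in the same way. The sample below is a made-up, heavily trimmed stand-in for describe-stream output (a real response carries hash key and sequence number ranges for every shard). It also shows jq's length function for counting array entries. Note that this counts every shard listed in the response, closed ones included, so it may differ from OpenShardCount.

```shell
# Hypothetical, trimmed sample of describe-stream output (values are made up).
sample='{
  "StreamDescription": {
    "StreamName": "my-kinesis-stream",
    "Shards": [
      {"ShardId": "shardId-000000000000"},
      {"ShardId": "shardId-000000000001"}
    ]
  }
}'

# Print the shard IDs as plain text, one per line.
shard_ids=$(printf '%s' "$sample" | jq -r '.StreamDescription.Shards[].ShardId')
echo "$shard_ids"

# Count the shards listed in the response.
shard_count=$(printf '%s' "$sample" | jq '.StreamDescription.Shards | length')
echo "$shard_count"
```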

4. To show the number of open shards in the Kinesis data stream. No command prints just this number by default, so you have to combine describe-stream-summary with jq.

$ aws kinesis describe-stream-summary --stream-name my-kinesis-stream | jq '.StreamDescriptionSummary.OpenShardCount'
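
When more than one field is of interest, jq can build a smaller object instead of printing a single value. A minimal sketch, assuming a made-up and trimmed describe-stream-summary response; the -c option prints the result on one line.

```shell
# Hypothetical, trimmed sample of describe-stream-summary output (values are made up).
sample='{
  "StreamDescriptionSummary": {
    "StreamName": "my-kinesis-stream",
    "StreamStatus": "ACTIVE",
    "RetentionPeriodHours": 24,
    "OpenShardCount": 1
  }
}'

# Keep only three fields; {StreamName} is shorthand for {"StreamName": .StreamName}.
summary=$(printf '%s' "$sample" | jq -c '.StreamDescriptionSummary | {StreamName, StreamStatus, OpenShardCount}')
echo "$summary"
```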

5. To put a record into the Kinesis stream. This returns the ShardId and a SequenceNumber.

$ aws kinesis put-record --stream-name my-kinesis-stream --data "my record 1" --partition-key my-key-1 --cli-binary-format raw-in-base64-out
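
The put-record response is small, but jq can still turn it into a readable one-line message using string interpolation. This is a sketch against a made-up response; the sequence number below is only a placeholder value.

```shell
# Hypothetical sample of put-record output (the sequence number is a placeholder).
sample='{
  "ShardId": "shardId-000000000000",
  "SequenceNumber": "49590338271490256608559692538361571095921575989136588898"
}'

# \(...) inside a jq string inserts the value of the enclosed expression.
message=$(printf '%s' "$sample" | jq -r '"stored in \(.ShardId) at sequence \(.SequenceNumber)"')
echo "$message"
```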

6. To read the Kinesis records, we first need a shard iterator. Execute the command below.

$ aws kinesis get-shard-iterator --stream-name my-kinesis-stream --shard-id shardId-000000000000 --shard-iterator-type TRIM_HORIZON
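
Rather than copying the iterator by hand, it can be captured in a shell variable with jq -r. The sketch below runs against a made-up response (real iterators are long opaque strings), and the commented lines show how the two AWS calls might be chained on a live stream.

```shell
# Hypothetical sample of get-shard-iterator output; the value is a placeholder.
sample='{"ShardIterator": "AAAAAAAAAAExampleIteratorValue=="}'

# -r returns the raw string, ready to pass to another command.
shard_iterator=$(printf '%s' "$sample" | jq -r '.ShardIterator')
echo "$shard_iterator"

# With a live stream, the calls could be chained like this (not run here):
#   shard_iterator=$(aws kinesis get-shard-iterator --stream-name my-kinesis-stream \
#     --shard-id shardId-000000000000 --shard-iterator-type TRIM_HORIZON | jq -r '.ShardIterator')
#   aws kinesis get-records --shard-iterator "$shard_iterator"
```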

7. Now read all the Kinesis records, using the shard iterator returned by the command above. To filter the data, you can combine the command with jq here as well. The Data field is always base64-encoded in the output.

$ aws kinesis get-records --shard-iterator returned_shard_iterator | jq '.Records[].Data'
The output looks like the following. It shows that I had issued nine put-record requests.

"cmVjb3JkIDE="
"cmVjb3JkIDI="
"cmVjb3JkIDM="
"cmVjb3JkIDEx"
"cmVjb3JkIDEx"
"bXkgcmVjb3JkIDEx"
"SGVsbG8sICAxMQ=="
"R3JlZXRpbnMgZm9ybSBLaW5lc2lz"
"bXkgcmVjb3JkIDE="

Finally, getting the Kinesis data stream records in clear text: we can extend the above command to decode each record. The -r option makes jq print the strings without quotes, so no temporary file or quote stripping is needed.

$ aws kinesis get-records --shard-iterator returned_shard_iterator | jq -r '.Records[].Data' | while read -r data; do echo "$data" | base64 --decode; echo; done
Output in clear text. I used some random strings when putting the records into the Kinesis data stream.

record 1
record 2
record 3
record 11
record 11
my record 11
Hello,  11
Greetins form Kinesis
my record 1
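
As an alternative to piping through base64, jq can decode the data itself with the @base64d filter. This is a sketch, assuming jq 1.6 or later (older versions lack @base64d). The sample reuses three of the encoded strings shown earlier; the surrounding structure is trimmed and the partition keys are assumptions.

```shell
# Hypothetical, trimmed sample of get-records output using Data values from above.
sample='{
  "Records": [
    {"Data": "cmVjb3JkIDE=", "PartitionKey": "my-key-1"},
    {"Data": "cmVjb3JkIDI=", "PartitionKey": "my-key-1"},
    {"Data": "bXkgcmVjb3JkIDE=", "PartitionKey": "my-key-1"}
  ]
}'

# Decode each base64 Data value inside jq; -r prints the text without quotes.
decoded=$(printf '%s' "$sample" | jq -r '.Records[].Data | @base64d')
echo "$decoded"
```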

Conclusion: jq is a time-saving tool when you are working with complex JSON output.

Disclaimer: Verify the commands thoroughly before using them in a production environment.
