Mapping JSON to spreadsheet
This tutorial covers how to map JSON document to a spreadsheet document, very similar to what we covered in this tutorial where we illustrated how to map XML document to a spreadsheet document.
Throughout this tutorial, we will be using this sample JSON document
to illustrate how to achieve it using the orcus-json
command. The structure
of this tutorial will be similar to the structure of the XML mapping counterpart,
since the steps are very similar.
Examining the structure of the input JSON document
Let’s first take a look at the sample JSON document:
[
{
"id": 1,
"name": [
"Tab",
"Limpenny"
],
"active": true,
"gender": "Male",
"language": "Kazakh"
},
{
"id": 2,
"name": [
"Manda",
"Hadgraft"
],
"active": false,
"gender": "Female",
"language": "Bislama"
},
{
"id": 3,
"name": [
"Mickie",
"Boreham"
],
"active": false,
"gender": "Male",
"language": "Swahili"
},
...
This is essentially the same content as the XML sample document we used in the last tutorial but re-formatted in JSON.
Let run the following command:
orcus-json --mode structure example.json
to analyze the structure of this JSON document. The command will generate the following output:
$array[20].object(*)['active'].value
$array[20].object(*)['gender'].value
$array[20].object(*)['id'].value
$array[20].object(*)['language'].value
$array[20].object(*)['name'].array[2].value[0,1]
This structure output resembles a variant of JSONPath but some modifications are applied. It has the following characteristics:
The
$
symbol represents the root of the structure.Array node takes the form of either
array[N]
, where the value ofN
represents the number of elements.Object node takes the form of
object['key']
.Value node, which is always a leaf node, is represented by
value
except when the leaf node is an array containing values, it takes the form ofvalue[0,1,2,...]
.The
.
symbols represent the node boundaries.The
(*)
symbols represent recurring nodes, which can be either array or object.
Auto-mapping the JSON document
Let’s map this JSON document to a spreadsheet document by running:
orcus-json --mode map -o out -f flat example.json
This is very similar to what we did in the XML mapping tutorial, except that
the command used is orcus-json
and the input file is example.json
.
This will create file named out/range-0.txt
which contains the following:
---
Sheet name: range-0
rows: 21 cols: 6
+--------+-----------+-------------+-----------+--------+----------------+
| id | field 0 | field 1 | active | gender | language |
+--------+-----------+-------------+-----------+--------+----------------+
| 1 [v] | Tab | Limpenny | true [b] | Male | Kazakh |
+--------+-----------+-------------+-----------+--------+----------------+
| 2 [v] | Manda | Hadgraft | false [b] | Female | Bislama |
+--------+-----------+-------------+-----------+--------+----------------+
| 3 [v] | Mickie | Boreham | false [b] | Male | Swahili |
+--------+-----------+-------------+-----------+--------+----------------+
| 4 [v] | Celinka | Brookfield | false [b] | Female | Gagauz |
+--------+-----------+-------------+-----------+--------+----------------+
| 5 [v] | Muffin | Bleas | false [b] | Female | Hiri Motu |
+--------+-----------+-------------+-----------+--------+----------------+
| 6 [v] | Jackelyn | Crumb | false [b] | Female | Northern Sotho |
+--------+-----------+-------------+-----------+--------+----------------+
| 7 [v] | Tessie | Hollingsbee | true [b] | Female | Fijian |
+--------+-----------+-------------+-----------+--------+----------------+
| 8 [v] | Yank | Wernham | false [b] | Male | Tok Pisin |
+--------+-----------+-------------+-----------+--------+----------------+
| 9 [v] | Brendan | Lello | true [b] | Male | Fijian |
+--------+-----------+-------------+-----------+--------+----------------+
| 10 [v] | Arabel | Rigg | false [b] | Female | Kyrgyz |
+--------+-----------+-------------+-----------+--------+----------------+
| 11 [v] | Carolann | McElory | false [b] | Female | Pashto |
+--------+-----------+-------------+-----------+--------+----------------+
| 12 [v] | Gasparo | Flack | false [b] | Male | Telugu |
+--------+-----------+-------------+-----------+--------+----------------+
| 13 [v] | Eolanda | Polendine | false [b] | Female | Kashmiri |
+--------+-----------+-------------+-----------+--------+----------------+
| 14 [v] | Brock | McCaw | false [b] | Male | Tsonga |
+--------+-----------+-------------+-----------+--------+----------------+
| 15 [v] | Wenda | Espinas | false [b] | Female | Bulgarian |
+--------+-----------+-------------+-----------+--------+----------------+
| 16 [v] | Zachary | Banane | true [b] | Male | Persian |
+--------+-----------+-------------+-----------+--------+----------------+
| 17 [v] | Sallyanne | Mengue | false [b] | Female | Latvian |
+--------+-----------+-------------+-----------+--------+----------------+
| 18 [v] | Elizabet | Hoofe | true [b] | Female | Tswana |
+--------+-----------+-------------+-----------+--------+----------------+
| 19 [v] | Alastair | Hutchence | true [b] | Male | Ndebele |
+--------+-----------+-------------+-----------+--------+----------------+
| 20 [v] | Minor | Worland | true [b] | Male | Dutch |
+--------+-----------+-------------+-----------+--------+----------------+
Again, this is very similar to what we saw in the XML-mapping example. Note
that cell values with [v]
and [b]
indicate numeric and boolean values,
respectively. Cells with no suffixes are string cells.
Custom-mapping using map file
This process is also very similar to the process we followed for XML mapping. We first auto-generate a map file, modify it, and use it to do the mapping again. Since there isn’t much difference between XML mapping and JSON mapping, let’s just go through this very quick.
First step is to generate a map file for the auto-detected range by running:
orcus-json --mode map-gen -o map.json example.json
which will write the mapping rules to map.json
file. When you open the generated
map file, you will see something like the following:
{
"sheets": [
"range-0"
],
"ranges": [
{
"sheet": "range-0",
"row": 0,
"column": 0,
"row-header": true,
"fields": [
{
"path": "$[]['id']"
},
{
"path": "$[]['name'][0]"
},
{
"path": "$[]['name'][1]"
},
{
"path": "$[]['active']"
},
{
"path": "$[]['gender']"
},
{
"path": "$[]['language']"
}
],
"row-groups": [
{
"path": "$"
}
]
}
]
}
The structure and content of the map file should look similar to the XML counterpart,
except that it is now in JSON format, and the paths are expressed in slightly
modified JSONPath bracket notation, where []
represents an array node with
no position specified.
Now that we have a map file, let’s modify this and use it to do the mapping once again. Just like the XML mapping example, we are going to:
insert two blank rows above,
drop the
id
andactive
fields,specify labels for the fields, and
change the sheet name from
range-0
toMy Data
.
This is what we’ve come up with:
{
"sheets": [
"My Data"
],
"ranges": [
{
"sheet": "My Data",
"row": 2,
"column": 0,
"row-header": true,
"fields": [
{
"path": "$[]['name'][0]", "label": "First Name"
},
{
"path": "$[]['name'][1]", "label": "Last Name"
},
{
"path": "$[]['gender']", "label": "Gender"
},
{
"path": "$[]['language']", "label": "Language"
}
],
"row-groups": [
{
"path": "$"
}
]
}
]
}
We’ll save this file as map-modified.json
, and pass it to the orcus-json
command via --map
or -m
option:
orcus-json --mode map -o out -f flat -m map-modified.json example.json
Let’s check the output in out/My Data.txt
and see what it contains:
---
Sheet name: My Data
rows: 23 cols: 4
+------------+-------------+--------+----------------+
| | | | |
+------------+-------------+--------+----------------+
| | | | |
+------------+-------------+--------+----------------+
| First Name | Last Name | Gender | Language |
+------------+-------------+--------+----------------+
| Tab | Limpenny | Male | Kazakh |
+------------+-------------+--------+----------------+
| Manda | Hadgraft | Female | Bislama |
+------------+-------------+--------+----------------+
| Mickie | Boreham | Male | Swahili |
+------------+-------------+--------+----------------+
| Celinka | Brookfield | Female | Gagauz |
+------------+-------------+--------+----------------+
| Muffin | Bleas | Female | Hiri Motu |
+------------+-------------+--------+----------------+
| Jackelyn | Crumb | Female | Northern Sotho |
+------------+-------------+--------+----------------+
| Tessie | Hollingsbee | Female | Fijian |
+------------+-------------+--------+----------------+
| Yank | Wernham | Male | Tok Pisin |
+------------+-------------+--------+----------------+
| Brendan | Lello | Male | Fijian |
+------------+-------------+--------+----------------+
| Arabel | Rigg | Female | Kyrgyz |
+------------+-------------+--------+----------------+
| Carolann | McElory | Female | Pashto |
+------------+-------------+--------+----------------+
| Gasparo | Flack | Male | Telugu |
+------------+-------------+--------+----------------+
| Eolanda | Polendine | Female | Kashmiri |
+------------+-------------+--------+----------------+
| Brock | McCaw | Male | Tsonga |
+------------+-------------+--------+----------------+
| Wenda | Espinas | Female | Bulgarian |
+------------+-------------+--------+----------------+
| Zachary | Banane | Male | Persian |
+------------+-------------+--------+----------------+
| Sallyanne | Mengue | Female | Latvian |
+------------+-------------+--------+----------------+
| Elizabet | Hoofe | Female | Tswana |
+------------+-------------+--------+----------------+
| Alastair | Hutchence | Male | Ndebele |
+------------+-------------+--------+----------------+
| Minor | Worland | Male | Dutch |
+------------+-------------+--------+----------------+
The id
and active
fields are gone, the remaining fields have custom
labels we specified, and there are two blank rows above. It appears that all
the changes we have intended have been properly applied.