Spreadsheet Processing in Mathematica
Mathematica is a software system for mathematics and scientific computing. It supports command-line tools and JavaScript extensions.
SheetJS is a JavaScript library for reading and writing data from spreadsheets.
This demo uses SheetJS to pull data from a spreadsheet for further analysis within Mathematica. We'll explore how to run an external tool to generate CSV data from opaque spreadsheets and parse the data from Mathematica.
This demo was last tested by SheetJS users on 2023 November 04 in Mathematica 13.
Integration Details
The SheetJS NodeJS module can be
loaded in NodeJS scripts, including scripts invoked using the "NodeJS" mode
of the ExternalEvaluate1 Mathematica function.
However, the current cross-platform recommendation involves a dedicated command line tool that leverages SheetJS libraries to to perform spreadsheet processing.
External Engines
The following diagram depicts the workbook waltz:
Mathematica
NodeJS can be activated from Mathematica using RegisterExternalEvaluator2.
Once activated, JavaScript code can be run using ExternalEvaluate3. If the
NodeJS code returns CSV data, ImportString4 can generate a Dataset5.
SheetJS
For a file residing on the filesystem, the SheetJS readFile function6 can
generate a workbook object. The exact location can be determined by printing
require("process").cwd()7 in ExternalEvaluate:
In[1]:= ExternalEvaluate["NodeJS", "require('process').cwd()"]
Out[1]= "C:\Users\Me\Documents"
After pulling the first worksheet8, the SheetJS sheet_to_csv function9
generates a CSV string.
Complete Function
The following function reads a file, parses the first worksheet and returns a Dataset object assuming one header row.
- Linux/MacOS
- Windows
(* Import file stored in the Documents folder (e.g. C:\Users\Me\Documents) *)
SheetJSImportFileEE[filename_]:=Module[{csv}, (
(* This was required in local testing *)
RegisterExternalEvaluator["NodeJS","/usr/local/bin/node"];
(* Generate CSV from first sheet *)
csv:=ExternalEvaluate["NodeJS", StringJoin[
(* module installed in home directory *)
"var XLSX = require('xlsx');",
(* read specified filename *)
"var wb = XLSX.readFile('",filename,"');",
(* grab first worksheet *)
"var ws = wb.Sheets[wb.SheetNames[0]];",
(* convert to CSV *)
"XLSX.utils.sheet_to_csv(ws)"
]];
(* Parse CSV into a dataset *)
Return[ImportString[csv, "Dataset", "HeaderLines"->1]];
)]
(* Import file stored in the Documents folder (e.g. C:\Users\Me\Documents) *)
SheetJSImportFileEE[filename_]:=Module[{csv}, (
(* This was required in local testing *)
RegisterExternalEvaluator["NodeJS","C:\\Program Files\\nodejs\\node.exe"];
(* Generate CSV from first sheet *)
csv:=ExternalEvaluate["NodeJS", StringJoin[
(* module installed in home directory *)
"var XLSX = require('xlsx');",
(* read specified filename *)
"var wb = XLSX.readFile('",filename,"');",
(* grab first worksheet *)
"var ws = wb.Sheets[wb.SheetNames[0]];",
(* convert to CSV *)
"XLSX.utils.sheet_to_csv(ws)"
]];
(* Parse CSV into a dataset *)
Return[ImportString[csv, "Dataset", "HeaderLines"->1]];
)]
How to run the example (click to hide)
This example was last tested on 2023 November 04 with Mathematica 13.3.
0) Install NodeJS. When the demo was tested, version 20.9.0 was installed.
1) Install dependencies in the Home folder (~ or $HOME or %HOMEPATH%):
npm i --save https://cdn.sheetjs.com/xlsx-0.20.1/xlsx-0.20.1.tgz [email protected]
2) Open a new Mathematica Notebook and register NodeJS. When the example was tested in Windows, the commands were:
- Linux/MacOS
- Windows
RegisterExternalEvaluator["NodeJS","/usr/local/bin/node"]
FindExternalEvaluators["NodeJS"]
RegisterExternalEvaluator["NodeJS","C:\\Program Files\\nodejs\\node.exe"]
FindExternalEvaluators["NodeJS"]
The second argument to RegisterExternalEvaluator should be the path to the
node or node.exe binary.
If NodeJS is registered, the value in the "Registered" column will be "True".
4) To determine the base folder, run require("process").cwd() from NodeJS:
ExternalEvaluate["NodeJS", "require('process').cwd()"]
5) Download pres.numbers and move the file
to the base folder as shown in the previous step.
6) Copy and evaluate the "Complete Function" in the previous codeblock.
7) Run the function and confirm the result is a proper Dataset:
SheetJSImportFileEE["pres.numbers"]

Command-Line Tools
The "Command-Line Tools" demo creates xlsx-cli, a
command-line tool that reads a spreadsheet file and generates CSV rows from the
first worksheet.
ExternalEvaluate10 can run command-line tools and capture standard output.
The following snippet processes ~/Downloads/pres.numbers and pulls CSV data
into a variable in Mathematica:
cmd = "/usr/local/bin/xlsx-cli ~/Downloads/pres.numbers"
csvdata = ExternalEvaluate["Shell" -> "StandardOutput", cmd];
ImportString11 can interpret the CSV data as a Dataset12. Typically the
first row of the CSV output is the header row. The HeaderLines13 option
controls how Mathematica parses the data:
data = ImportString[csvdata, "Dataset", "HeaderLines" -> 1]
The following diagram depicts the workbook waltz:
Complete Demo
1) Create the standalone xlsx-cli binary14:
npm i --save https://cdn.sheetjs.com/xlsx-0.20.1/xlsx-0.20.1.tgz exit-on-epipe commander@2
curl -LO https://docs.sheetjs.com/cli/xlsx-cli.js
npx nexe -t 14.15.3 xlsx-cli.js
- Linux/MacOS
- Windows
2) Move the generated xlsx-cli to a fixed location in /usr/local/bin:
mkdir -p /usr/local/bin
mv xlsx-cli /usr/local/bin/
2) Find the current directory:
cd
The generated binary will be xlsx-cli.exe in the displayed path.
Reading a Local File
3) In a new Mathematica notebook, run the following snippet:
SheetJSImportFile[x_] := ImportString[Block[{Print}, ExternalEvaluate[
"Shell" -> "StandardOutput",
"/usr/local/bin/xlsx-cli " <> x
]], "Dataset", "HeaderLines" -> 1]
- Linux/MacOS
- Windows
Change /usr/local/bin/xlsx-cli in the string to the path to the generated
xlsx-cli.exe binary. For example, if the path in step 2 was
C:\Users\Me\Documents\, then the code should be:
SheetJSImportFile[x_] := ImportString[Block[{Print}, ExternalEvaluate[
"Shell" -> "StandardOutput",
"C:\\Users\\Me\\Documents\\xlsx-cli.exe " <> x
]], "Dataset", "HeaderLines" -> 1]
The \ characters must be doubled.
4) Download https://sheetjs.com/pres.numbers and save to Downloads folder:
cd ~/Downloads/
curl -LO https://sheetjs.com/pres.numbers
5) In the Mathematica notebook, run the new function. If the file was saved to
the Downloads folder, the path will be "~/Downloads/pres.numbers" in macOS:
data = SheetJSImportFile["~/Downloads/pres.numbers"]
The result should be displayed in a concise table.

Reading from a URL
FetchURL15 downloads a file from a specified URL and returns a path to the
file. This function will be wrapped in a new function called SheetJSImportURL.
6) In the same notebook, run the following:
Needs["Utilities`URLTools`"];
SheetJSImportURL[x_] := Module[{path},(
path = FetchURL[x];
SheetJSImportFile[path]
)];
7) Test by downloading the test file in the notebook:
data = SheetJSImportURL["https://sheetjs.com/pres.numbers"]

- See the
ExternalEvaluateNode.js example in the Mathematica documentation.↩ - See
RegisterExternalEvaluatorin the Mathematica documentation.↩ - See
ExternalEvaluatein the Mathematica documentation.↩ - See
ImportStringin the Mathematica documentation.↩ - A
Datasetwill be created when using the"Dataset"element inImportString↩ - See
readFilein "Reading Files"↩ - See
process.cwd()in the NodeJS documentation.↩ - The
SheetsandSheetNamesproperties of workbook objects are described in "Workbook Object"↩ - See
sheet_to_csvin "CSV and Text"↩ - See
ExternalEvaluatein the Mathematica documentation.↩ - See
ImportStringin the Mathematica documentation.↩ - A
Datasetwill be created when using the"Dataset"element inImportString↩ - See
HeaderLinesin the Mathematica documentation.↩ - See "Command-line Tools" for more details.↩
- Mathematica 11 introduced new methods including
URLRead.↩