How to package a JavaScript library for use in BigQuery

Evgeny Medvedev
The Startup
Published in
3 min readFeb 8, 2020

--

In this article you’ll learn how to package a JavaScript library for use in a BigQuery UDF. We’ll consider a particular example — ethers.js —a complete Ethereum library and wallet implementation in JavaScript. It will allow us to decode raw transactions and logs data in the bigquery-public-data.crypto_ethereum dataset in BigQuery.

The whole process can be broken down into three steps:

  1. Create a package.js file with the JavaScript lib dependency.
  2. Create webpack.config.js and build a JS file using webpack.
  3. Upload the generated JS file to GCS and use it in BigQuery UDF.

You can find the source code for bundling and using ethers.js in this Github repo: https://github.com/blockchain-etl/ethers.js-bigquery.

Now, let’s go through each step in detail.

Create a package.js file

In the image below I highlighted the webpack build script, which we will use later; the dependency on the “ethers” library with its version; and dependencies on webpack and webpack-cli:

https://github.com/blockchain-etl/ethers.js-bigquery/blob/master/package.json

Run npm install to download the dependencies.

Create webpack.config.js and build a JS file using webpack

The image below highlights the entry point, starting from which webpack will aggregate all transitive dependencies and assemble them in a single file:

https://github.com/blockchain-etl/ethers.js-bigquery/blob/master/webpack.config.js

The output configuration tells it to save the result todist/ethers.js file, and that the library will be available as ethers variable when included in a UDF.

Run npm run-script build to generate dist/ethers.js.

Upload the generated JS file to GCS and use it in BigQuery UDF

You’ll need to create a GCS bucket, upload the generated file to it, and make the file publicly readable:

gsutil mb gs://your-bucket
gsutil cp dist/ethers.js gs://your-bucket
gsutil acl ch -u AllUsers:R gs://your-bucket/ethers.js

Now you are ready to use it in a UDF for parsing Ethereum data. The image below highlights the usage of the ethers variable available inside the UDF; how the library is included; and the logs table in the crypto_ethereum dataset:

https://gist.github.com/medvedev1088/2c4bca11b9a0a7fc32ff947613b17d94

You can run it by pasting into the BigQuery console query
editor: https://console.cloud.google.com/bigquery. And this is how the result looks like (all CryptoKitties Transfer events):

Also read this article from my fellow D5 member Alex Svanevik on How to get any Ethereum smart contract into BigQuery (in 8 mins).

--

--