Read JSON from Azure Blob to SQL

Overview

Azure Blob Storage is a highly scalable object storage solution from Microsoft Azure, commonly used to store large amounts of unstructured data, including JSON files. Reading JSON from Blob Storage and landing it in a SQL Database in structured form lets you combine the flexibility of JSON with the robustness of relational databases. This blog walks you through reading JSON file content directly from Azure Blob Storage and inserting it into a SQL Database.

These days, working with CSV or JSON data is more and more common. Sometimes, especially when debugging or developing a new feature, I need to look at that JSON data before it is sent on to other microservices for processing, or afterwards, once it has been stored in the database.

Considering how often I work with SQL, it would be incredibly helpful to query JSON directly from T-SQL, without first having to download the file from the Azure Blob Store. That would make my work simpler and more efficient.

I would really like to have access to JSON wherever it is, exactly like you do with Hadoop or Azure Data Lake.

Well, you can. I just found out that with the latest additions (available since SQL Server 2017 and already present on Azure SQL v 12.0.2000.8) it is incredibly easy.
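As a quick taste before we touch Blob Storage at all, OPENJSON can parse a plain JSON string held in a T-SQL variable. This is just an illustrative sketch with made-up sample values:

```sql
-- Parse an inline JSON string with OPENJSON (no storage access needed)
DECLARE @json NVARCHAR(MAX) = N'[
  { "Id": 1, "StudentName": "Alexandra" },
  { "Id": 2, "StudentName": "Sophie" }
]';

SELECT j.Id, j.StudentName
FROM OPENJSON(@json) WITH (
    Id INT,
    StudentName NVARCHAR(100)
) AS j;
```

The same WITH clause pattern is what we will use later against files sitting in Blob Storage.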

Prerequisites

Before we dive in, make sure you have the following:

  • Azure Subscription: An active Azure subscription.
  • Azure Blob Storage Account: A storage account with containers and JSON files.

Step-by-Step Guide

1. Set Up Your Azure Blob Storage

  • Create a Blob Container: Navigate to your Azure Storage Account and create a blob container.
  • Upload JSON Files: Upload your JSON files to the container. Ensure the JSON structure is consistent and ready for processing.
Here is the sample JSON file content:

```json
[
  {
    "Id": 1,
    "StudentName": "Alexandra",
    "Age": 33,
    "Email": "alexandra@xyz.com"
  },
  {
    "Id": 2,
    "StudentName": "Sophie",
    "Age": 13,
    "Email": "sophie@xyz.com"
  },
  {
    "Id": 3,
    "StudentName": "Sarah",
    "Age": 23,
    "Email": "sarah@xyz.com"
  },
  {
    "Id": 4,
    "StudentName": "Ruth",
    "Age": 43,
    "Email": "ruth@xyz.com"
  },
  {
    "Id": 5,
    "StudentName": "Rose",
    "Age": 60,
    "Email": "rose@xyz.com"
  }
]
```

2. Configure SQL Database

  • Create Database: If you haven't already, create an SQL Database.
  • Define Schema: Design the schema in your SQL Database to match the structure of your JSON data. For example:
```sql
CREATE TABLE Students (
    Id INT PRIMARY KEY,
    StudentName NVARCHAR(100),
    Age INT,
    Email NVARCHAR(100)
);
```

First of all, a Shared Access Signature (SAS) needs to be created so that SQL can access the Azure Blob Store where your JSON files live. This can be done from the Azure Portal or from Azure Storage Explorer.

Once you have the signature, create a Database Scoped Credential that points to it:

```sql
CREATE DATABASE SCOPED CREDENTIAL MyBlobCredential
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
SECRET = '<Your SAS Token, WITHOUT any leading "?">';
```

If your database doesn't already have one, you will first need to create a database master key; otherwise the statement above will fail.

```sql
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'My-L0ng&Str0ng_P4ss0wrd!';
```

After creating the credential, build an External Data Source that points to the Azure Blob Storage container holding your JSON files.

```sql
CREATE EXTERNAL DATA SOURCE MyAzureBlobStorage
WITH (
    TYPE = BLOB_STORAGE,
    LOCATION = 'https://<storage-account>.blob.core.windows.net/agreements',
    CREDENTIAL = MyBlobCredential -- CREDENTIAL is not required if the blob is configured for public (anonymous) access!
);
```

After that, you can use OPENROWSET and OPENJSON to begin experimenting with JSON files:

```sql
-- Read the file and store the raw JSON in a temp table
SELECT CAST(BulkColumn AS NVARCHAR(MAX)) AS JsonData
INTO #TempTable
FROM OPENROWSET(BULK 'File_Name.json', DATA_SOURCE = 'MyAzureBlobStorage', SINGLE_CLOB) AS AzureBlob;

-- Parse the JSON into rows and columns
SELECT j.*
FROM #TempTable t
CROSS APPLY OPENJSON(t.JsonData) WITH (
    Id INT,
    StudentName NVARCHAR(100),
    Age INT,
    Email NVARCHAR(100)
) AS j;
```
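The temp table is optional, by the way. If you only need the parsed rows, the read and the parse can be combined in a single statement (the file name is a placeholder):

```sql
-- Read and parse in one statement, without a temp table
SELECT j.*
FROM OPENROWSET(BULK 'File_Name.json', DATA_SOURCE = 'MyAzureBlobStorage', SINGLE_CLOB) AS AzureBlob
CROSS APPLY OPENJSON(CAST(AzureBlob.BulkColumn AS NVARCHAR(MAX))) WITH (
    Id INT,
    StudentName NVARCHAR(100),
    Age INT,
    Email NVARCHAR(100)
) AS j;
```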

Output

*(screenshot: Json_Read_Result — the parsed rows returned by the query)*

Is it possible to get the code? Absolutely, here is the gist of it 🙋
```sql
-- Create master key (only needed once per database)
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'My-L0ng&Str0ng_P4ss0wrd!';

-- Clean up from previous runs
DROP EXTERNAL DATA SOURCE MyAzureBlobStorage;
DROP DATABASE SCOPED CREDENTIAL MyBlobCredential;
DROP TABLE #TempTable;

-- Step 1: Create the credential
CREATE DATABASE SCOPED CREDENTIAL MyBlobCredential
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
SECRET = 'sv=2023-01-03&st=2024-07-30T04%3A35%3A51Z&se=2024-07-31T04%3A35%3A51Z&sr=c&sp=rl&sig=ec%2FTVH0k%2BXTUyS2FY5y9hIg8WfvdQ%2FBfaZF1VzKjo6Y%3D';

-- Step 2: Create the external data source
CREATE EXTERNAL DATA SOURCE MyAzureBlobStorage
WITH (
    TYPE = BLOB_STORAGE,
    LOCATION = 'https://helioseventstoretest.blob.core.windows.net/agreements',
    CREDENTIAL = MyBlobCredential -- CREDENTIAL is not required if the blob is configured for public (anonymous) access!
);

-- Step 3: Read the file and store the raw JSON in a temp table
SELECT CAST(BulkColumn AS NVARCHAR(MAX)) AS JsonData
INTO #TempTable
FROM OPENROWSET(BULK 'test.json', DATA_SOURCE = 'MyAzureBlobStorage', SINGLE_CLOB) AS AzureBlob;

-- Step 4: Insert into the SQL table
INSERT INTO dbo.Students (Id, StudentName, Age, Email)
SELECT j.*
FROM #TempTable t
CROSS APPLY OPENJSON(t.JsonData) WITH (
    Id INT,
    StudentName NVARCHAR(100),
    Age INT,
    Email NVARCHAR(100)
) AS j;

-- Step 5: Read from the SQL table
SELECT * FROM dbo.Students;
```

I hope you enjoyed reading this. With this approach, though, we can only read the contents of one file at a time. In a later blog post, I'll demonstrate how to use Azure Synapse to read many files at once.