Querying and Manipulating JSON Data with T-SQL

In this guide you will learn how to utilize several JSON built-in functions using an instance of the Northwind dataset running on top of SQL Server 2017 Express

By Gabriel Cánepa

Dec 13, 2019 • 10 Minute Read

Subscribe to the newsletter

Introduction

JSON (short for JavaScript Object Notation) was first introduced in the early 2000s as a lightweight format to exchange data between applications and servers. However, it was not until 2013 that its use became widespread when it was standardized by ECMA International as ECMA-404. A couple of years later—starting with SQL Server 2016—Microsoft added built-in functions that allow developers to manipulate JSON data using T-SQL directly.

You can view the original (first version, dated October 2013) draft of the ECMA-404 standard here. The current version (published on December 2017) is available for download as well.

Among other things, these functions allow you to execute SELECT and UPDATE queries on JSON text and view sets of JSON objects as table records (and vice versa). In addition, the exponential rise of the Internet of Things over the last decade, along with the complexity of data, made it necessary for several companies to adopt NoSQL databases. With SQL Server 2016 and later, you can leverage the best of both worlds (structured and unstructured data) using the same engine.

In this guide you will learn how to utilize several JSON built-in functions using an instance of the Northwind dataset running on top of SQL Server 2017 Express and its associated SQL Server Management Studio version. You can download the database installation script from the Microsoft SQL Server Samples repository and use it under the terms of the MIT license.

Viewing Results of Queries as JSON

To begin, let us start by displaying the result of a SELECT statement in JSON format. Consider the following T-SQL query, which returns the output below:

          SELECT TOP 10 			 
			c.CompanyName, 
			c.City, 
			c.Country,
			COUNT(o.OrderID) AS CountOrders
FROM		Customers c 
JOIN		Orders o 
ON			c.CustomerID = o.CustomerID
GROUP BY	c.CompanyName, c.City, c.Country
ORDER BY	COUNT(o.OrderId) DESC
    

Next, add FOR JSON PATH at the end of the query as shown below and execute it again. The result set now consists of a single row with a link that contains the same dataset but formatted as JSON. Click the link to view it in a separate SQL Server Management Studio tab.

Figure 2 - Viewing the result set as JSON

Unfortunately, SSMS does not provide a native method to display JSON in a friendly manner at the time of this writing. Fortunately, most text editors (Notepad++ or VisualStudio Code, to name two examples) have plugins that can take care of that, as shown below.

As you can see, each record was converted into a JSON object and the result set became an array of such objects. Finally, each individual cell is represented by a specific key-value pair. Alternatively, you can add a root element by replacing FOR JSON PATH with FOR JSON PATH, ROOT('Top10Customers'):

          SELECT TOP 10 			 
			c.CompanyName, 
			c.City, 
			c.Country,
			COUNT(o.OrderID) AS CountOrders
FROM		Customers c 
JOIN		Orders o 
ON			c.CustomerID = o.CustomerID
GROUP BY	c.CompanyName, c.City, c.Country
ORDER BY	COUNT(o.OrderId) DESC
FOR			JSON PATH, ROOT('Top10Customers')
    

This image shows the difference between both queries (with and without a root element):

The decision of whether including a root element or not depends on the needs of the application consuming the result set. In any event, thanks to FOR JSON PATH, you can return JSON data directly so you do not have to serialize the response separately at the application level.

Transforming JSON into Tabular Format

The Employees table in the Northwind database contains a column called Notes with information on each employee including education, interests and previous experience—but it's all text. Let us create a new table where we will take some of the information in this field and convert it into a JSON object that will be easier to manipulate. This will allow us to simulate an unstructured set of attributes. To accomplish this goal, we will do the following:

          CREATE TABLE [Northwind].[dbo].NewestHires (
	EmployeeID	INT PRIMARY KEY,
	FullName	VARCHAR(100)	NOT NULL,
	Title		VARCHAR(50)		NOT NULL,
	HireDate	VARCHAR(50)		NOT NULL,
	Notes		VARCHAR(MAX)	NOT NULL
)
    

Let's insert the newest five hires from the Employees table to a new one called NewestHires. Note that the we will populate Notes with JSON data (which will be empty initially so we can illustrate how to update it later):

          INSERT INTO NewestHires
SELECT TOP 5 
		EmployeeID,
		CONCAT(FirstName, ' ', LastName) AS FullName, 
		Title, 
		HireDate, 
		'{"education": "", "field": "", "bilingual": ""}' AS Notes
FROM	Employees ORDER BY HireDate DESC
    

To view the attributes of each JSON object as columns, we can use the JSON_VALUE function to extract a value from a JSON string (or JSON_QUERY to retrieve an array). The first parameter is the name of the column that contains the object, whereas the second is the actual attribute and the dollar sign $ represents the entire object.

          SELECT 
		FullName,
		Title,
		HireDate,
		JSON_VALUE(Notes, '$.education') AS Education,
		JSON_VALUE(Notes, '$.field') AS Field,
		JSON_VALUE(Notes, '$.bilingual') AS Bilingual 
FROM	NewestHires
    

If the object had a root element called info, the second parameter that is passed to JSON_VALUE should be '$.info.education', '$.info.field', and '$.info.bilingual' respectively.

Updating Key-Value Pairs in a Given Column

To update Steven Buchanan's (EmployeeID = 5) education to include a BA, we can do:

          UPDATE NewestHires SET Notes = 
JSON_MODIFY(Notes, '$.education', 'BA') 
WHERE EmployeeID = 5
    

or to update Michael Suyama's (EmployeeID = 6) education with an MA and an MBA:

          UPDATE NewestHires SET Notes = 
JSON_MODIFY(Notes, '$.education', JSON_QUERY('["BA","MBA"]')) 
WHERE EmployeeID = 6
    

The use of JSON_QUERY above is required to extract, or convert, the regular text inside single quotes into a JSON fragment.

It is also possible to update several attributes at once using as many nested JSON_MODIFY functions as necessary:

          UPDATE NewestHires SET Notes = 
JSON_MODIFY(
	JSON_MODIFY(
		JSON_MODIFY(Notes, '$.education', 'BA'),
	'$.field', 'Psychology'),
'$.bilingual', CAST(1 AS BIT)
)
WHERE EmployeeID = 8
    

Finally, you can even append values to an array attribute. For example, if Michael Suyama receives his PhD, we will need to update his education accordingly (note the use of the keyword append in the second argument that is passed to JSON_MODIFY):

          UPDATE NewestHires SET Notes = 
JSON_MODIFY(Notes, 'append $.education', 'PhD') 
WHERE EmployeeID = 6
    

This image shows the NewestHires table before and after the previous four queries:

Figure 4 - Querying the NewestHires table

Reading JSON Into Rows and Columns

If, instead of having only one JSON column (such as Notes above), we need to parse an entire object into rows and columns, the OPENJSON function will be our ally. To demonstrate, let's consider the following JSON object that we have saved into a variable called @NewEmployees:

          DECLARE @NewEmployees NVARCHAR(MAX);
SET @NewEmployees = N'[
	{
		"EmployeeID": 23,
		"FullName": "James Hulse",
		"Title": "Project Manager",
		"HireDate": "2019-12-04"
	},
	{
		"EmployeeID": 24,
		"FullName": "Sally White",
		"Title": "Sales Supervisor",
		"HireDate": "2019-12-05"
	}
]';
    

To display these JSON objects as rows and columns, we will pass @NewEmployees as argument to OPENJSON. The next steps consists of telling the function what data types should go with each property. To do this, we need to enclose the column names, the desired data types and the corresponding properties inside a WITH clause.

          SELECT *
FROM OPENJSON(@NewEmployees)
WITH (
	EmployeeID INT 'strict $.EmployeeID',
	FullName VARCHAR(100) '$.FullName',
	Title VARCHAR(50) '$.Title',
	HireDate DATETIME '$.HireDate'
);
    

This image shows the results of the above query:

Figure 5 - Displaying JSON objects in rows and columns

In this example, we added the strict prefix in EmployeeID to specify that EmployeeID must exist in the JSON object (it makes sense if this field will be a primary key). It is important to note that we do not necessarily have to return all the components of the JSON objects (to leave one or more out, simply omit them in the WITH clause). Also, OPENJSON can handle both flat structures or hierarchical objects with one or more root elements (what we said earlier about JSON_VALUE applies here as well).

Conclusion

In this guide we have learned how to query JSON data using T-SQL queries and how to present sets of records as JSON objects (and the other way around). These skills will help you simplify API development and allow you to implement robust structured and unstructured data models. With JSON, you do not need large numbers of relational tables or additional columns for each new attribute that you need to represent.

Gabriel C.

Written Content Author

More about this author