
ChatGPT is all the rage right now, and understandably so. Feeding it prompts and watching it spit out quality articles, recipes, travel itineraries, code examples, and more is fun and addictive. Yep, it’s a rabbit hole, and one we’re all getting a kick out of.
One of ChatGPT’s many neat tricks is its ability to generate SQL statements; you simply tell ChatGPT about your tables, verify that it understands the structure properly, and then ask for what you’d like returned, all in simple English. It works, and really well. And so, in this post I’ll demonstrate how to take advantage of this magical capability using tables (well, a subset of the tables) and examples from my Sams Teach Yourself SQL in 10 Minutes.
Ok, here goes. Obviously, you’ll need to log in to ChatGPT (creating an account if you’ve not yet done so).
The first thing you need to do is tell ChatGPT about your tables. Here’s the prompt I used:
Customers table contains columns cust_id, cust_name. Orders table contains columns order_num, order_date, cust_id. Vendors table contains columns vend_id, vend_name. Products table contains columns prod_id, vend_id, prod_name, prod_price. OrderItems table contains columns order_num, prod_id, quantity, item_price.
As you can see, I simply told ChatGPT the table names and the columns in each. (There are actually more columns in my tables; for brevity’s sake I opted to just list the ones needed here). So, what did ChatGPT do with this info? This is how it responded:
It appears you are providing the structure of multiple database tables in a relational database, possibly SQL. These tables are Customers, Orders, Vendors, Products, and OrderItems.* Customerstable: It has customer information, each identified by a uniquecust_idand associated with acust_name.* Orderstable: It records individual orders, each associated with a uniqueorder_num,order_date, and linked to a customer throughcust_id.* Vendorstable: It holds vendor information, each identified by a uniquevend_idand avend_name.* Productstable: It has product information. Each product has a uniqueprod_id, aprod_name, and aprod_price. Each product is linked to a vendor throughvend_id.
*OrderItemstable: This table details the items in each order, identified byorder_numandprod_id. It also has thequantityof each product and theitem_pricein the order.
Please let me know what specific query or operation you would like to perform on this database structure.
Ok. Take a deep breath, and … read that response. Seriously, that is mind-boggling! Not only did ChatGPT understand the tables I enumerated, but it also figured out all the table relationships. It worked out how Orders relate to Customers, and Products to OrderItems, and so on. I didn’t define any of them, it worked it which columns in which tables were related to others. It helps that the column name conventions I used were consistent, but still, wow!
Now to see how well ChatGPT generates SQL. This was my next prompt:
Return a list of customers with a count of orders for each.
ChatGPT worked for a few seconds and returned this:
To return a list of customers along with a count of their orders, you could perform a SQL query that JOINs the Customers and Orders tables on the cust_id field, and then groups by the customer name or ID while counting the number of orders. Here's an example of how you might do this:
SELECT C.cust_id, C.cust_name, COUNT(O.order_num) AS order_count FROM Customers C LEFT JOIN Orders O ON C.cust_id = O.cust_id GROUP BY C.cust_id, C.cust_name ORDER BY order_count DESC;
This query works by joining the Customers and Orders tables together where thecust_idfields match. For each customer, it counts the number of associated orders (order_numentries). TheGROUP BYclause groups the results by bothcust_idandcust_nameto provide a count for each customer. TheORDER BYclause sorts the result by the count of orders in descending order so customers with the most orders come first. Please note that if there are customers who haven't made any orders, they will also appear in the list with an order count of 0, because of theLEFT JOINoperation. If you only want customers who have made at least one order, you could use anINNER JOINinstead.
Well, yeah, that worked, ChatGPT returned a SELECT statement that correctly joins the two needed tables so as to return customers and an order count for each. I love that it also added a note explaining that it has used a LEFT JOIN in order to include customers with 0 orders, and to change that to an INNER JOIN if that is not what is desired.
Impressive, right? But what about DBMS specific SQL? The best way to try that is to use dates (date manipulation functions tend to be notoriously DBMS specific). Here’s the prompt I used:
Select customers who have made an order in the past 90 days with total order amount.
This seems simple, right? Nope! There is actually no way to generate this SQL for every major DBMS, the functions needed to calculate 90 days from a specific date really are very DBMS specific. Here’s what ChatGPT returned:
Sure, you can create a query that joins theCustomers,Orders, andOrderItemstables to calculate the total order amount for each customer in the past 90 days. Here is an example SQL query:
SELECT
C.cust_id,
C.cust_name,
SUM(OI.item_price * OI.quantity) AS total_order_amount
FROM
Customers C
INNER JOIN
Orders O ON C.cust_id = O.cust_id
INNER JOIN
OrderItems OI ON O.order_num = OI.order_num
WHERE
O.order_date >= CURDATE() - INTERVAL 90 DAY
GROUP BY
C.cust_id,
C.cust_name
HAVING
total_order_amount > 0
ORDER BY
total_order_amount DESC;
This query starts by joining theCustomers,Orders, andOrderItemstables together. It uses theWHEREclause to filter out any orders that weren't placed in the last 90 days. It then calculates the total order amount for each customer by multiplying the item price by the quantity (OI.item_price * OI.quantity) and summing these values. Finally, it uses theHAVINGclause to include only customers who have a total order amount greater than 0, and orders the results by the total order amount in descending order. Please note that the date comparison in theWHEREclause might slightly vary depending on the specific SQL dialect used. The query assumes theorder_datecolumn is of a date type that can be directly compared withCURDATE() - INTERVAL 90 DAY.
ChatGPT returns a valid response, and notes that the WHERE clause might slightly vary depending on the specific SQL dialect used. Yeah, no kidding! So, what DBMS will this work with? Let’s ask ChatGPT:
What DBMS can I use with this SQL statement?
ChatGPT responds with:
The SQL statement before the SQL Server-specific one is written in a fairly standard SQL dialect and should work on several different database management systems (DBMS), with potentially minor modifications due to slight differences in SQL syntax and function naming. Some of these DBMS include: * MySQL: It's probably the most compatible with this statement, as the original statement appears to be written in a MySQL-friendly syntax. * PostgreSQL: This DBMS also uses standard SQL syntax similar to that used in the statement. However, the CURDATE() function should be replaced with CURRENT_DATE. * MariaDB: As a fork of MySQL, MariaDB maintains a high level of compatibility with MySQL's SQL syntax. * SQLite: While SQLite uses a variant of SQL, the query might require minor modifications. For instance, SQLite does not support the INTERVAL keyword, so the date subtraction part would need to be adjusted. Please note that Oracle, SQL Server, and some other DBMS might require more significant modifications due to differences in function naming and SQL syntax.
Turns out ChatGPT generated syntax that would work with MySQL, a reasonable place to start, and also noted the tweaks needed use these statements with PostgreSQL and SQLite. And, yes, it also noted that Oracle, SQL Server, and some other DBMS might require more significant modifications. Ok, so let’s ask ChatGPT to do just that:
Give me a version of that SELECT statement that will work with Microsoft SQL Server
And the response is:
Microsoft SQL Server has some differences in syntax and function naming compared to other SQL dialects. Here's how you might write the same query in SQL Server:
SELECT
C.cust_id,
C.cust_name,
SUM(OI.item_price * OI.quantity) AS total_order_amount
FROM
Customers C
INNER JOIN
Orders O ON C.cust_id = O.cust_id
INNER JOIN
OrderItems OI ON O.order_num = OI.order_num
WHERE
O.order_date >= DATEADD(day, -90, GETDATE())
GROUP BY
C.cust_id,
C.cust_name
HAVING
SUM(OI.item_price * OI.quantity) > 0
ORDER BY
total_order_amount DESC;
TheGETDATE()function in SQL Server returns the current date and time, andDATEADD(day, -90, GETDATE())subtracts 90 days from the current date. This replacedCURDATE() - INTERVAL 90 DAYfrom the original query.
Another change is in theHAVINGclause. SQL Server requires the condition in theHAVINGclause to contain an aggregate function, it does not allow aliases to be used as they are not yet defined at the time when the clause is processed. Thus, I replaced thetotal_order_amountalias with the expression it stands for:SUM(OI.item_price * OI.quantity).
You get the idea. ChatGPT does a remarkable job generating SQL so long as:
- Start by defining your tables. Use plain English, the less fussy the better.
- Verify that ChatGPT has correctly understood your table structure. I it hasn’t (this can happen if column names are less consistent) then add a prompt explaining the required table relationships.
- Then prompt for the SQL statements by clearly defining the desired output (in English, not SQL).
- It’s best to tell ChatGPT what DBMS you are using.
And there you have it, SQL generation powered by ChatGPT!
Oh, BTW, that image at the top left, the robot database thingy? That’s also generated by an AI, specifically Adobe Firefly. Seemed fitting for this post.

Leave a Reply