Art Level SQL: goyesql


One of the never-ending debates in software development is the use of ORMs vs. query builders vs. writing raw SQL. An art-level approach to managing SQL queries is hiding in plain sight, and I am here to share it.

ORM vs. Query Builder vs. Raw SQL

Before we go into detail, it is important to discuss the alternative methods and their implications, and to identify the root cause of this debate.

The database is an external service from the application perspective. Two services interact with each other under a contract, the API interface. For example, in a RESTful application it is safe to assume GET /book/:bookId returns a book in a predefined format and DELETE /book/:bookId will delete that book.

Databases support many operations and their combinations on user-defined schemas. This results in endless possible SQL statements, and endless possible query results. Moreover, you can't even validate the correctness of an SQL statement unless it is sent to the database.

🦀
https://github.com/launchbadge/sqlx features compile-time checked queries, which means queries are validated against a live database during compilation.

What do ORMs suggest?

ORMs suggest that database queries should be built by the application code. This means every operation SQL can perform needs to be expressible in the application code.

Many people oppose the idea simply because this approach leads to what is known as a Leaky Abstraction. An abstraction should hide the underlying implementation details, but an ORM cannot hide some SQL features, because doing so would limit the application's ability to interact with the database.

The ORMs also try to support multiple SQL paradigms, resulting in even leakier abstractions.

💡
Leaky Abstraction: An abstraction layer that can't abstract/hide the underlying implementation details from user space.

What do Query Builders suggest?

The name is self-explanatory, but I will explain anyway. Query builders let you write SQL statements; they don't hide that you are working with SQL. At the end of the day, an SQL statement is a string, and a string can't be type-checked by your beloved programming language.

Query builders try to close that gap and help you build SQL statements using programming language constructs.
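To make this concrete, here is a toy query builder written just for this post. It is a hypothetical, minimal sketch, not a real library (real ones like squirrel are far richer), but it shows the core idea: building the SQL string from typed method calls instead of writing it by hand.

```go
package main

import (
	"fmt"
	"strings"
)

// SelectBuilder is a toy query builder, written only to illustrate
// the concept. It accumulates pieces of a SELECT statement.
type SelectBuilder struct {
	columns []string
	table   string
	wheres  []string
}

// Select starts a new builder with the given columns.
func Select(cols ...string) *SelectBuilder {
	return &SelectBuilder{columns: cols}
}

// From sets the table name.
func (b *SelectBuilder) From(table string) *SelectBuilder {
	b.table = table
	return b
}

// Where appends a condition; multiple conditions are ANDed together.
func (b *SelectBuilder) Where(cond string) *SelectBuilder {
	b.wheres = append(b.wheres, cond)
	return b
}

// ToSQL renders the final statement as a string.
func (b *SelectBuilder) ToSQL() string {
	sql := fmt.Sprintf("SELECT %s FROM %s", strings.Join(b.columns, ", "), b.table)
	if len(b.wheres) > 0 {
		sql += " WHERE " + strings.Join(b.wheres, " AND ")
	}
	return sql
}

func main() {
	q := Select("id", "title").From("books").Where("id = $1").ToSQL()
	fmt.Println(q)
	// SELECT id, title FROM books WHERE id = $1
}
```

Note that even here the end product is still a plain string handed to the database; the builder only shifts where that string gets assembled.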


In short, some developers are uncomfortable with SQL's flexibility and want to tone it down a bit. I think the truly uncomfortable idea is that we can never settle these debates; as the saying goes: "it depends".

A Better Way: goyesql

With goyesql you can write queries in a file and map them to SQL statements in the application code. An example file:

-- queries.sql --

-- name: insert-book
INSERT INTO books (title) VALUES ($1);

-- name: get-books
-- Get all books
SELECT * FROM books;

-- name: get-book-by-id
-- Get book by id
SELECT * FROM books WHERE id = $1;

In SQL files, you can write comments using --. goyesql takes advantage of this and maps each SQL statement to its respective name.

🤷‍♂️
I will consistently refer to goyesql to avoid ambiguity; however, it is neither the first nor the last to use this concept. It is even forked from another package. The same concept appears in many packages across languages, but I couldn't find a common name that describes it. Some names I encountered: yesql, dotsql, hugsql, puresql.
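The parsing idea behind all of these libraries can be sketched in a few lines. This is my own simplification for illustration, not goyesql's actual parser (which handles more edge cases):

```go
package main

import (
	"fmt"
	"strings"
)

// parseQueries maps `-- name:` tagged statements to their names.
// A simplified sketch of the concept, not goyesql's real parser.
func parseQueries(content string) map[string]string {
	queries := make(map[string]string)
	var current string
	for _, line := range strings.Split(content, "\n") {
		trimmed := strings.TrimSpace(line)
		switch {
		case strings.HasPrefix(trimmed, "-- name:"):
			// A name comment starts a new query.
			current = strings.TrimSpace(strings.TrimPrefix(trimmed, "-- name:"))
		case strings.HasPrefix(trimmed, "--"), trimmed == "":
			// Other comments and blank lines are skipped.
		case current != "":
			// Statement lines accumulate under the current name.
			queries[current] += trimmed + " "
		}
	}
	// Trim the trailing spaces accumulated above.
	for name, q := range queries {
		queries[name] = strings.TrimSpace(q)
	}
	return queries
}

func main() {
	content := `-- name: get-book-by-id
-- Get book by id
SELECT * FROM books WHERE id = $1;`
	fmt.Println(parseQueries(content)["get-book-by-id"])
	// SELECT * FROM books WHERE id = $1;
}
```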

The idea is to use raw SQL statements, but keep them outside the application code. It is a beautiful way to apply separation of concerns. In application code you can access those queries as:

package main

import (
  "database/sql"
  "fmt"
  "os"

  "github.com/knadh/goyesql/v2"
  _ "github.com/lib/pq"
)

type Queries struct {
  InsertBook  *sql.Stmt `query:"insert-book"`
  GetBooks    *sql.Stmt `query:"get-books"`
  GetBookById *sql.Stmt `query:"get-book-by-id"`
}

type Book struct {
  ID    int64  `db:"id"`
  Title string `db:"title"`
}

func main() {
  db := ConnectDBAndRunMigrations()
  queries := goyesql.MustParseFile("queries.sql")

  var bookQueries Queries

  err := goyesql.ScanToStruct(&bookQueries, queries, db)
  if err != nil {
    panic(err)
  }

  // Insert 10 books to populate db
  for i := 0; i < 10; i++ {
    title := fmt.Sprintf("my book #%v", i)
    _, err = bookQueries.InsertBook.Exec(title)
    if err != nil {
      panic(err)
    }
  }

  row := bookQueries.GetBookById.QueryRow(5)
  if err = row.Err(); err != nil {
    panic(err)
  }

  var book Book
  err = row.Scan(&book.ID, &book.Title)
  if err != nil {
    panic(err)
  }

  fmt.Printf("Book ID: %v, Book Title: %v\n", book.ID, book.Title)
  // Book ID: 5, Book Title: my book #4
}

Step by Step Introduction

First start a go project:

go mod init example

Then add the dependencies:

go get -u github.com/knadh/goyesql/v2 github.com/lib/pq

lib/pq is used to interact with PostgreSQL, but you can use whichever database driver you want.

Create a file queries.sql and write the following queries inside:

-- queries.sql --

-- name: insert-book
INSERT INTO books (title) VALUES ($1);

-- name: get-books
-- Get all books
SELECT * FROM books;

-- name: get-book-by-id
-- Get book by id
SELECT * FROM books WHERE id = $1;

You also need to create the tables in the database, so create another file, schema.sql:

DROP TABLE IF EXISTS books CASCADE;
CREATE TABLE books
(
  id    SERIAL PRIMARY KEY,
  title VARCHAR(255)
);

These are all the SQL statements we need. Now create a Go file, main.go. Make sure the package is main and it has a main function.

package main

func main() {
}

First, we need to establish a database connection and run our migrations. To keep our main function simple, let's create another function, ConnectDBAndRunMigrations, for this purpose.

func ConnectDBAndRunMigrations() *sql.DB {
  psqlInfo := fmt.Sprintf("host=%s port=%d user=%s "+
    "password=%s dbname=%s sslmode=disable",
    host, port, user, password, dbname)
  db, err := sql.Open("postgres", psqlInfo)
  if err != nil {
    panic(err)
  }
  err = db.Ping()
  if err != nil {
    panic(err)
  }

  // Execute migrations from schema.sql
  schema, err := os.ReadFile("schema.sql")
  if err != nil {
    panic(err)
  }
  _, err = db.Exec(string(schema))
  if err != nil {
    panic(err)
  }

  return db
}

To keep our example simple, we can panic on any error. Also don't forget to define connection parameters in the code:

const (
  host     = "localhost"
  port     = 5432
  user     = "user"
  password = "1234"
  dbname   = "example"
)

If you don't have a database running, you can start one with the following docker-compose.yml:

version: '3'

services:
  postgres:
    image: postgres:14-alpine
    ports:
      - "5432:5432"
    environment:
      - POSTGRES_PASSWORD=1234
      - POSTGRES_USER=user
      - POSTGRES_DB=example

You can create a docker-compose.yml and run

docker compose up

to start a local PostgreSQL instance.

Now our main function should look like this:

package main

func main() {
  db := ConnectDBAndRunMigrations()
  _ = db // used in the following steps
}

// ...

Now we can focus on reading the queries.sql into SQL statements in the application code. goyesql has a convenient method to do just that:

package main

func main() {
  db := ConnectDBAndRunMigrations()
  _ = db // used in the next step to prepare statements

  // Parse the SQL file into a map[string]*Query
  queries := goyesql.MustParseFile("queries.sql")

  // Queries can be accessed by name:
  fmt.Println(queries["get-book-by-id"].Query)
  // => SELECT * FROM books WHERE id = $1;
}

// ...

You can stop here and use goyesql like the above example. However, we can awaken its true power by mapping those queries to struct fields. For that, we need to create a struct and tag its fields with the query names like so:

type Queries struct {
  InsertBook  *sql.Stmt `query:"insert-book"`
  GetBooks    *sql.Stmt `query:"get-books"`
  GetBookById *sql.Stmt `query:"get-book-by-id"`
}

While we are at it, let's also define the Book struct:

type Book struct {
  ID    int64  `db:"id"`
  Title string `db:"title"`
}

Now we can scan the queries.sql and map those queries to struct fields:

package main

// ... struct definitions

func main(){
  db := ConnectDBAndRunMigrations()

  // Parse the SQL file into a map[string]*Query
  queries := goyesql.MustParseFile("queries.sql")

  var bookQueries Queries
  err := goyesql.ScanToStruct(&bookQueries, queries, db)
  if err != nil {
    panic(err)
  }
}

// ...

Once we have successfully created the Queries instance, it is very simple to use.

An example call to get a book by its id, using the prepared statement mapped from queries.sql:

// ...
row := bookQueries.GetBookById.QueryRow(5)
if err = row.Err(); err != nil {
  panic(err)
}

var book Book
err = row.Scan(&book.ID, &book.Title)
if err != nil {
  panic(err)
}
// ...

An example call to insert a book:

// ...
title := fmt.Sprintf("my book #%v", i)
_, err = bookQueries.InsertBook.Exec(title)
if err != nil {
  panic(err)
}
// ...
💡
If you want to use sqlx instead of database/sql, simply use the methods from "github.com/knadh/goyesql/v2/sqlx" and you are good to go.

What is Beautiful About This Approach?

  1. All the queries are visible in a central place.
  2. Raw SQL queries without cluttering the application code.
  3. Very simple abstraction that can be understood without prior knowledge of the technique.

All three of these advantages are very important in a project. Each serves the same vision: simplicity. I want to explain each of them in a little more detail.

Central Place for Queries

This is great for debugging and inspecting database queries. Otherwise, you would have to dive into the application code and go through all the repository files.

This approach doesn't force you to write all SQL queries in a single queries.sql file. However, I don't see any reason to split them apart. Instead, you can think of it the other way around: this method makes it very convenient to keep all the queries in a single file.

SQL Doesn't Clutter the Application Code

Whether you use ORMs, query builders, or inline SQL queries, they all look ugly inside the application code. They become a source of distraction and fear. With this approach you don't need to read a single line of SQL inside the application code.

No Prior Knowledge Required to Understand

There is nothing better than a self-explanatory abstraction. Even if you've never heard of goyesql before, you can immediately see what is going on in the code base. This is in contrast to ORMs and query builders, which require prior knowledge of the specific library.

Even your future self can understand what is going on. Don't hate your future self, love them.

Proof This Approach Works in Big Projects

As a self-taught developer, I find it very educational to inspect open-source projects and learn from more experienced developers' techniques. This is an approach I discovered while inspecting Listmonk.

I am pretty sure you are not convinced of something until you see it in action; you are a developer, after all. So don't take my word for it: go check out their repository and see how it makes things much simpler compared to other methods.

Conclusion

Besides ORMs, query builders, and raw SQL, there is another approach to managing SQL queries. It doesn't have a definitive name, but in this post we refer to it as goyesql. It has three main advantages:

  1. An easily manageable single file for SQL queries
  2. Raw SQL statements without cluttering the application code
  3. A simple method that requires no prior knowledge

These are also the reasons why this method is not talked about much. Complex solutions usually bring a lot of questions with them, so they get discussed far more than simpler ones (complexity bias?).

In this post, I wanted to shine a spotlight on this simple solution. I hope you enjoyed it!