Kotlin introduced coroutines for asynchronous programming and more. This article covers why we need Kotlin coroutines, their fundamental building blocks in theory, and then shows them in action (code) to explain the blocks clearly.

The code shown is also available on GitHub.

Why do we need coroutines?

Apps perform various tasks, and each task takes a different amount of time. While these tasks of varying duration run, we need to ensure the user experience isn't degraded. One way to do that (while also reducing bottlenecks) is to perform the operations asynchronously.

Let’s discuss the potential solutions and their problems:

  1. Threading  –  A great solution, but managing threads can quickly get out of hand.
  2. Callbacks  –  Simple, obvious and easy to implement, but can quickly degrade into callback hell.
  3. Futures and promises  –  Personally I haven't used them a lot, but I find them not powerful enough for my use-cases.
  4. Reactive (Rx) streams  –  A powerful solution, but it introduces a drastic change in the way of thinking.
  5. Coroutines  –  An elegant solution. It offers a small learning curve in comparison to the other solutions, needs minimal changes to code, and the code looks the same as synchronous code.

Let’s begin our understanding of coroutines.

Coroutines are suspendible execution blocks

Don't worry. I'll explain these keywords later. In simple words, they are like lightweight threads: you can create millions of them and they will run on a shared pool of threads.

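Note: Most of the coroutine machinery isn't bundled with the Kotlin language itself (only low-level support such as the suspend keyword is). It is instead packaged as a library, kotlinx.coroutines, by JetBrains. So from here on, when I say library, I mean the coroutine library.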
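
To make the "lightweight" claim concrete, here is a minimal sketch, assuming kotlinx.coroutines is on the classpath. The function name runMany and the count are my own choices for illustration. It launches many coroutines that each wait and then bump a counter.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical helper: launches `count` coroutines and returns how many finished.
fun runMany(count: Int): Int = runBlocking {
    val finished = AtomicInteger(0)
    val jobs = List(count) {
        launch(Dispatchers.Default) {
            delay(100)                  // suspends; no thread is held while waiting
            finished.incrementAndGet()
        }
    }
    jobs.joinAll()                      // wait for every coroutine to complete
    finished.get()
}

fun main() {
    println(runMany(100_000))           // prints 100000
}
```

Trying the same thing with 100,000 real threads would typically be far more expensive, which is the point of the comparison.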

Basic keywords and building blocks


The best way to explain all the required keywords is through an example. So let’s take one. Suppose we have a task T in which we need to combine values from 5 other long running child tasks - T1, T2, T3, T4, T5.

T is a fire-and-forget kind of task  –  the invoking page doesn't care about its result and the task must be completed even if the page closes. T1, T2, T3 are network calls; T2 depends on T1, and T3 on T2. T4 is a file read and T5 is a database entry operation which depends on T4. We'll call the coroutine used to do task T 'C', and the coroutines for the child tasks C1, C2, C3, C4, C5 respectively.

Now before we dive into the code, we need to learn a few keywords and building blocks, otherwise the code can get a little overwhelming. So let's go over them quickly once; we'll revisit them in the code.

To understand the following clearly, remember:
1. Coroutines are like light-weight threads mapped to actual threads being executed.
2. Coroutines can launch other child coroutines and wait on their results.

Dispatcher

  • Encapsulates the thread or the thread pool on which a coroutine will execute.
  • For example, we want coroutine C to start on a background thread dedicated to IO operations. For C5 we'll want a DB thread.
  • Some dispatchers are already provided by the coroutine library:
    Main (thread), IO (thread pool), Default (thread pool) etc.
  • We can make our own dispatcher by wrapping a single thread or a thread pool.
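
As a rough sketch of the points above (assuming kotlinx.coroutines), the following prints which thread a block runs on for different dispatchers. The names threadNameOn, dbThreadName and "db-thread" are hypothetical and only used for illustration.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.Executors

// Runs a tiny block on the given dispatcher and reports the thread it ran on.
suspend fun threadNameOn(dispatcher: CoroutineDispatcher): String =
    withContext(dispatcher) { Thread.currentThread().name }

// Builds a custom dispatcher around a single named thread, uses it, then closes it.
fun dbThreadName(): String = runBlocking {
    val db = Executors.newSingleThreadExecutor { r -> Thread(r, "db-thread") }
        .asCoroutineDispatcher()
    try {
        threadNameOn(db)
    } finally {
        db.close()      // shut the executor down so the JVM can exit
    }
}

fun main() {
    runBlocking {
        println("Default: ${threadNameOn(Dispatchers.Default)}")
        println("IO:      ${threadNameOn(Dispatchers.IO)}")
    }
    println("Custom:  ${dbThreadName()}")
}
```

The exact worker names will vary between runs. The custom dispatcher always reports a name starting with db-thread.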

Coroutine job

  • Represents a single coroutine task e.g. C or C1 etc.
  • It has a lifecycle with states (similar to thread states).
  • States: New, Active, Completing, Cancelling, Cancelled, Completed
  • A parent job completes only when all its child jobs have completed (or been cancelled).
  • For example, C can only complete or be cancelled once all of C1, C2, C3, C4, C5 have completed or been cancelled.
  • This is similar to the parent-child relationship of processes in the OS world.
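
The parent-child rule can be observed with a small sketch like the one below (assuming kotlinx.coroutines). The function name completionOrder is mine. It records the order in which things finish.

```kotlin
import kotlinx.coroutines.*

// Records when the parent's body ends, when the child ends,
// and when join() on the parent finally returns.
fun completionOrder(): List<String> = runBlocking {
    val order = mutableListOf<String>()
    val parent = launch {
        launch {                        // child of `parent`
            delay(100)
            order += "child done"
        }
        order += "parent body done"     // body ends, but the job is not complete yet
    }
    parent.join()                       // returns only after the child has finished too
    order += "parent completed"
    order
}

fun main() {
    println(completionOrder())
    // prints [parent body done, child done, parent completed]
}
```

Note that the parent's own code finishes first, yet the parent job is not complete until its child is.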

Coroutine start

  • Tells the library whether to start the coroutine as soon as it is created (the default) or lazily, only when asked to later.
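
Here is a minimal sketch of lazy start, assuming kotlinx.coroutines. The function name lazyStartLog is hypothetical. The lazily created coroutine does nothing until it is explicitly started.

```kotlin
import kotlinx.coroutines.*

fun lazyStartLog(): List<String> = runBlocking {
    val log = mutableListOf<String>()
    val lazyJob = launch(start = CoroutineStart.LAZY) {
        log += "lazy body ran"
    }
    delay(50)                                   // plenty of time to run, yet it doesn't
    log += "active before start: ${lazyJob.isActive}"
    lazyJob.start()                             // join() would also start it
    lazyJob.join()
    log
}

fun main() {
    println(lazyStartLog())
    // prints [active before start: false, lazy body ran]
}
```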

Coroutine context

  • A key-value map of the elements most important to a coroutine, such as its dispatcher, its job, its name and its exception handler. (A scope is a wrapper around a context, and the start option is passed to the builder separately rather than stored in the context.)
  • When a child coroutine is started inside a parent one, the library combines the parent's context with the context passed to the child's builder (an empty coroutine context by default).
  • For example, C5 when launched will combine C's context with whatever context it is given.
  • Suppose C is running on Dispatchers.IO but C5 has to run on DB threads. We'll pass CustomDispatchers.DB to C5 when launching it. The library will combine the contexts, and CustomDispatchers.DB will take priority over Dispatchers.IO.
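
The context combining described above can be sketched as follows, assuming kotlinx.coroutines. The helper describe and the coroutine names are my own, and I use Dispatchers.Default as a stand-in for a custom DB dispatcher.

```kotlin
import kotlinx.coroutines.*
import kotlin.coroutines.ContinuationInterceptor

// Describes the context a coroutine actually ended up with.
fun CoroutineScope.describe(): String {
    val name = coroutineContext[CoroutineName]?.name
    val dispatcher = coroutineContext[ContinuationInterceptor]
    val where = if (dispatcher === Dispatchers.Default) "Default" else "parent dispatcher"
    return "$name on $where"
}

fun contextDemo(): List<String> = runBlocking(Dispatchers.IO + CoroutineName("C")) {
    // Nothing passed: the child inherits both the name and the dispatcher.
    val inherited = async { describe() }
    // Elements passed to the builder win over the parent's elements.
    val overridden = async(Dispatchers.Default + CoroutineName("C5")) { describe() }
    listOf(inherited.await(), overridden.await())
}

fun main() {
    println(contextDemo())
    // prints [C on parent dispatcher, C5 on Default]
}
```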

Coroutine scope

  • The scope in which a coroutine runs (similar to variable scope). When the scope ends, i.e. is cancelled, so do the coroutines in it.
  • When a child coroutine is started inside a parent one, it is launched in the parent's scope (unless specified otherwise). This means that when the parent coroutine is cancelled, so is the child coroutine.
  • GlobalScope is provided by the library and lives as long as the app does. Coroutines launched in the global scope behave like daemon threads: they can't stop the app from exiting.
  • On Android, there are ways to create scopes tied to an activity or fragment, or scopes that are managed manually.
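
A manually managed scope could look like this rough sketch (assuming kotlinx.coroutines). The Page class is hypothetical and stands in for an activity or fragment. Everything launched in its scope is cancelled when the page closes.

```kotlin
import kotlinx.coroutines.*

// Hypothetical "page" that owns a scope and cancels it when it closes.
class Page {
    private val scope = CoroutineScope(SupervisorJob() + Dispatchers.Default)

    // Long-running work tied to the page's scope.
    fun startWork(): Job = scope.launch { delay(10_000) }

    fun close() = scope.cancel()
}

// Returns true if closing the page cancelled the work it had started.
fun workCancelledOnClose(): Boolean = runBlocking {
    val page = Page()
    val job = page.startWork()
    page.close()
    job.join()          // returns promptly because the job was cancelled
    job.isCancelled
}

fun main() {
    println("cancelled = ${workCancelledOnClose()}")   // prints cancelled = true
}
```

This is the opposite of what task T needs, which is why T is launched in the global scope later on.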

Coroutine builders

  • Builders are provided by the library to build coroutines around coroutine scopes.
  • launch, async and runBlocking are some of the builders given by the library.
  • They can pass their own context on to child coroutines launched inside them.
  • launch is used for fire-and-forget work. It returns immediately, giving back the coroutine's job.
  • async also returns immediately; it returns a special kind of job called Deferred, which can be cancelled or whose result can be awaited.
  • runBlocking is used to run tasks by blocking whichever thread it's called on until the block completes.
  • We'll look at an example soon to understand these builders better.
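
import kotlinx.coroutines.*

fun main() {
    // runBlocking blocks the calling thread until this block
    // (and the coroutines launched inside it) completes
    runBlocking {
        val job = launch {
            // fire and forget task is done here
        }

        val deferredJob = async<Int> {
            // a job which returns some result
            42
        }
        // Suspend this coroutine (without blocking a thread)
        // until deferredJob returns with the result
        val result = deferredJob.await()
        println(result)
    }
}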

suspend keyword

  • The suspend keyword is applicable to functions.
  • suspend means the function may suspend the calling coroutine: the coroutine's state (its stack frame) is saved, the thread is released, and execution resumes later from where it left off. Calling a suspend function doesn't necessarily mean a different coroutine gets involved.
  • Let's take an example:
import kotlinx.coroutines.*

fun main() {
    runBlocking {
        val coroutineX = GlobalScope.launch {
            val result1 = n1(5)
            println("Hello I'll print after n1 is over")
            println("Result1: $result1")
            val result2 = n2(5)
            println("Hello I'll print after n2 is over")
            println("Result2: $result2")
        }
        coroutineX.join()
    }
}

suspend fun n1(a: Int): Int {
    delay(500)
    return a
}

suspend fun n2(a: Int): Int {
    val coroutineY = GlobalScope.async (Dispatchers.IO) {
        // like Thread.sleep(), but suspends the coroutine instead of blocking the thread
        delay(500)  
        return@async a*a
    }
    return coroutineY.await()
}

/* 
OUTPUT:
Hello I'll print after n1 is over
Result1: 5
Hello I'll print after n2 is over
Result2: 25
*/

Explanation:
Here n1() is called inside coroutineX and doesn't launch another coroutine, so from coroutineX's point of view it reads just like calling a normal function: the next line runs only after n1() returns. (The delay() inside n1() does suspend coroutineX for 500 ms, but no other coroutine is involved.)

When n2() is called from coroutineX, it launches coroutineY on Dispatchers.IO and awaits its result. While waiting, the calling coroutine, coroutineX, is suspended and the thread it was executing on is returned to its dispatcher, Dispatchers.Default. The function stack of coroutineX is saved at this point.

When n2() returns the result, coroutineX is resumed from the saved stack and executes on either the same or a different thread from the Dispatchers.Default thread pool.

Note: A dispatcher guarantees that a coroutine resuming from suspension will run on a thread from its pool, but doesn't guarantee it will be the same thread. Usually we don't care, but if you have the use case, the library does provide ways to share thread-local data across the different threads a coroutine runs on.
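
One such way is the library's asContextElement() extension on ThreadLocal. Below is a minimal sketch, assuming kotlinx.coroutines. The requestId thread-local and the value "req-42" are hypothetical.

```kotlin
import kotlinx.coroutines.*

// Hypothetical thread-local we want to follow the coroutine across threads.
val requestId = ThreadLocal<String?>()

fun threadLocalDemo(): String? = runBlocking {
    // The context element installs the value on whichever thread runs the coroutine.
    withContext(Dispatchers.Default + requestId.asContextElement(value = "req-42")) {
        delay(50)           // may resume on a different worker thread
        requestId.get()     // the value is restored after resumption
    }
}

fun main() {
    println(threadLocalDemo())   // prints req-42
}
```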

delay function

  • It is like sleep for threads, except that it doesn't block the thread. It’s a suspend function which suspends the calling coroutine for the given time period.
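
To see the difference between suspending and blocking, here is a rough sketch (assuming kotlinx.coroutines; the function names are mine). Five coroutines share the single thread of runBlocking. With delay they wait concurrently, with Thread.sleep they wait one after another.

```kotlin
import kotlinx.coroutines.*
import kotlin.system.measureTimeMillis

// Five coroutines on one thread, each suspending for 200 ms.
fun timeWithDelay(): Long = measureTimeMillis {
    runBlocking {
        repeat(5) { launch { delay(200) } }
    }
}

// Five coroutines on one thread, each blocking that thread for 200 ms.
fun timeWithSleep(): Long = measureTimeMillis {
    runBlocking {
        repeat(5) { launch { Thread.sleep(200) } }
    }
}

fun main() {
    println("delay: ${timeWithDelay()} ms")          // roughly one 200 ms wait
    println("Thread.sleep: ${timeWithSleep()} ms")   // roughly five 200 ms waits
}
```

Exact timings depend on the machine, so treat the numbers as approximate.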

Enough talk – Show me the code

Here is the solution for the task T using coroutines. The analysis is done in comments in the code.

We'll look at what task we are doing, the coroutine we are launching, the thread we are on, whether the operation will suspend the calling function, the scope we use, why we choose a particular coroutine builder, the coroutine start option, and how jobs are represented and where they end.
Alright, let's see the code.

Let's see the functions for T1, T2 and T3 first.

import kotlinx.coroutines.*
import java.util.concurrent.Executors
import kotlin.system.exitProcess

suspend fun t1(): Int {
    println("Inside t1() on ${Thread.currentThread().name}")
    // Emulate network call
    delay(100)
    return 1
}

suspend fun t2(param1: Int): String {
    println("Inside t2() on ${Thread.currentThread().name}")
    // Emulate network call
    delay(200)
    return "${param1}_resultT2"
}

suspend fun t3(param2: String): String {
    println("Inside t3() on ${Thread.currentThread().name}")
    // Emulate network call
    delay(200)
    return "${param2}_resultT3"
}

Now let's see the functions for T4 and T5. Notice the use of a custom dispatcher and the async builder.

suspend fun t4(): String {
    println("Inside t4() on ${Thread.currentThread().name}")
    try {
        // Emulate reading a file
        delay(100)
        return "resultT4"
    } catch (e: Exception) {
        // Ignore exception handling for now
        return ""
    }
}

// A way to make custom dispatcher using Executors
object CustomDispatchers {
    val DB = Executors.newSingleThreadExecutor().asCoroutineDispatcher()
}

// Scope is passed so we can launch our db operation while respecting the scope
suspend fun t5(scope: CoroutineScope, param4: String): String {

    // Since this is a db entry operation we run this block 
    // using the DB dispatcher we made.
    return scope.async(CustomDispatchers.DB) {
        println("Inside t5() on ${Thread.currentThread().name}")
        try {
            // Emulate db entry operation
            delay(100)
            return@async "${param4}_resultT5"
        } catch (e: Exception) {
            // Ignore exception handling for now
            return@async ""
        }
    }.await()
}


Why was the async builder chosen for T5?
I'll explain in the following code, but see if you can piece it together as well.
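
For comparison only: a common alternative to async followed immediately by await() is the library's withContext function, which switches dispatcher for a block and returns its result. This is not what this article's code uses. The sketch below assumes kotlinx.coroutines, and the name t5WithContext and the db parameter are hypothetical.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.Executors

// Hypothetical variant of t5(): no scope parameter is needed because
// withContext runs the block as part of the calling coroutine.
suspend fun t5WithContext(param4: String, db: CoroutineDispatcher): String =
    withContext(db) {
        delay(100)                  // emulate db entry operation
        "${param4}_resultT5"
    }

fun main() {
    // Stand-in for the article's CustomDispatchers.DB
    val db = Executors.newSingleThreadExecutor().asCoroutineDispatcher()
    val result = runBlocking { t5WithContext("resultT4", db) }
    println(result)                 // prints resultT4_resultT5
    db.close()                      // shut the executor down so the JVM can exit
}
```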

Now let's see how coroutine C is created.
We launch in the global scope to keep the example simple.


/**
 * @return a job to represent the coroutine for 'T'
 * */
fun taskT(): Job {
    /*
    * Starting task 'T' by launching coroutine 'C'
    *
    * Task name: T
    * Coroutine name: C
    * Thread pool: IO is used since we do network calls etc.
    * Scope: We use global scope since the operation is to be completed 
    *        even if the page closes.
    * Builder explanation: launch builder is used since it is a fire and 
    *        forget task; the calling page doesn't care about its result.
    * suspension: No, this won't block the main thread; it simply starts 'C'.
    *
    * We return the representative job.
    * */
    return GlobalScope.launch(Dispatchers.IO) {
        // Logic binding code for t1, t2, t3, t4 and t5 is omitted here
    }
}

Now we have to do the tasks one by one. Check the code comments below for detailed explanation on what was done and why.

fun taskT(): Job {
    return GlobalScope.launch(Dispatchers.IO) {

        println("Starting C on ${Thread.currentThread().name}\n")

        /*
        * Next we need to do task 'T1' which is a network call. So we call t1()
        *
        * suspension: Since t1() is marked suspend, it'll suspend 
        *    the calling function. But since the code of 't1' isn't 
        *    launching a new coroutine, the same coroutine will be 
        *    used to execute t1(). Meaning we continue as if it's 
        *    just a synchronous function and the calling function
        *    isn't executing any further i.e. only when t1 returns 
        *    with the result will T execute any further.
        * Task name: T1
        * Coroutine name: C1 which is just C
        * Thread pool: Still IO as we are on 'C'
        *
        * We make the network call and then return the result.
        * */

        val resultT1 = t1()

        // Further code omitted 
    }
}

fun taskT(): Job {
    return GlobalScope.launch(Dispatchers.IO) {
        println("Starting C on ${Thread.currentThread().name}\n")
        val resultT1 = t1()

        /*
        * Next we need to do task 'T2' whose output is input to task 'T3' 
        * both of which are a network call.
        *
        * Builder explanation: We need the result from these two task for our 
        *        final result. So we'll use async builder. It'll return a deferred
        *        object which we store in variable 'jobT2T3'.
        *        Deferred is a job whose value one can await.
        * Task name: T2, T3
        * Coroutine name: C2, C3 respectively
        * Thread pool: Since nothing is mentioned, while mixing context the parent 
        *        dispatcher is used i.e. IO.
        * suspension: async is started as soon as it's called but it won't 
        *        suspend the calling function. 
        *        Async or launch will also not suspend the calling coroutine 
        *        since the former returns a deferred and the latter is fire 
        *        and forget type.
        * scope: As a child, if nothing is mentioned, it is launched in the scope of 
        *        the parent coroutine 'C', so cancelling 'C' cancels this job too.
        * */
        println("Starting task T2,T3 on ${Thread.currentThread().name}")
        val jobT2T3 = async {
            println("Inside Task T2,T3 on ${Thread.currentThread().name}")
            // t2 and t3 are suspend functions and will suspend this block until it 
            // returns the result
            val resultT2 = t2(resultT1)
            val resultT3 = t3(resultT2)
            return@async resultT3
        }
      
        // omitted code here
    }
}

fun taskT(): Job {
    return GlobalScope.launch(Dispatchers.IO) {
        println("Starting C on ${Thread.currentThread().name}\n")
        val resultT1 = t1()
        println("Starting task T2,T3 on ${Thread.currentThread().name}")
        val jobT2T3 = async {
            val resultT2 = t2(resultT1)
            val resultT3 = t3(resultT2)
            return@async resultT3
        }
        /*
        * Since tasks T4, T5 aren't dependent on task T2 or T3, using async 
        * for T2,T3 is a win for us since we can continue our execution.
        * Task 'T4' output is input to task 'T5'.
        *
        * Builder explanation: async is used as we need the result from these 
        *      two tasks for our final result
        * Task name: T4, T5
        * Coroutine name: C4, C5 respectively
        * Thread pool: Since T4 is a file read and T5 is a database write operation, 
        *      we use the default thread pool for our combined coroutine
        * suspension: It won't suspend the calling function.
        * start parameter: To demonstrate coroutine start, we pass it as lazy 
        *      - meaning only start this async coroutine when await() is called.
        *
        * */
        println("Starting task T4,T5 on ${Thread.currentThread().name}\n")
        val jobT4T5 = async(Dispatchers.Default, start = CoroutineStart.LAZY) {
            println("Inside Task T4,T5 on ${Thread.currentThread().name}")
            val resultT4 = t4()
            val resultT5 = t5(scope = this, param4 = resultT4)
            return@async resultT5
        }
        // omitted code here
    }
}

So far we have started all our tasks inside C. Now we have to combine the results of all tasks.

fun taskT(): Job {
    return GlobalScope.launch(Dispatchers.IO) {
        println("Starting C on ${Thread.currentThread().name}\n")
        val resultT1 = t1()
        println("Starting task T2,T3 on ${Thread.currentThread().name}")
        val jobT2T3 = async {
            val resultT2 = t2(resultT1)
            val resultT3 = t3(resultT2)
            return@async resultT3
        }
        println("Starting task T4,T5 on ${Thread.currentThread().name}\n")
        val jobT4T5 = async(Dispatchers.Default, start = CoroutineStart.LAZY) {
            println("Inside Task T4,T5 on ${Thread.currentThread().name}")
            val resultT4 = t4()
            val resultT5 = t5(scope = this, param4 = resultT4)
            return@async resultT5
        }
        // delay so that task T2,T3 starts and T4,T5 doesn't,
        // to demonstrate use of CoroutineStart.
        delay(100)
        println("Going to await result of task T2,T3 and T4,T5")

        // Now we await the result of task T2,T3 and T4,T5
        // and then combine their result.

        // await() is a suspend function which either
        // suspends the execution of calling function
        // if the deferred doesn't hold the value or
        // will return the result instantly.
        // Suspension: Will suspend the calling coroutine
        //
        // Note: jobT4T5 was created with CoroutineStart.LAZY, so it
        // only starts when await() is called on it, i.e. after
        // jobT2T3.await() has returned; this second await() will
        // therefore suspend. Had jobT4T5 been started eagerly (the
        // default), it could already have computed its result while
        // t2 and t3 were executing, and await() would return
        // immediately. That would be a case where calling a suspend
        // function doesn't result in a suspension.
        val combinedResult = jobT2T3.await() + "_" + jobT4T5.await()

        // After the await() calls for both tasks are done,
        // the jobs are now in the completed state.
        println("\nThe final result is $combinedResult")
    }
}

Here is the final code for task T

fun taskT(): Job {
    return GlobalScope.launch(Dispatchers.IO) {
        println("Starting C on ${Thread.currentThread().name}\n")
        val resultT1 = t1()
        println("Starting task T2,T3 on ${Thread.currentThread().name}")
        val jobT2T3 = async {
            println("Inside Task T2,T3 on ${Thread.currentThread().name}")
            // t2 and t3 are suspend functions and will suspend this block 
            // until it returns the result
            val resultT2 = t2(resultT1)
            val resultT3 = t3(resultT2)
            return@async resultT3
        }
        println("Starting task T4,T5 on ${Thread.currentThread().name}\n")
        val jobT4T5 = async(Dispatchers.Default, start = CoroutineStart.LAZY) {
            println("Inside Task T4,T5 on ${Thread.currentThread().name}")
            val resultT4 = t4()
            val resultT5 = t5(scope = this, param4 = resultT4)
            return@async resultT5
        }

        delay(100)
        println("Going to await result of task T2,T3 and T4,T5")
        val combinedResult = jobT2T3.await() + "_" + jobT4T5.await()

        println("\nThe final result is $combinedResult")
    }
}

Here is the main() calling T and waiting for it to complete.

fun main() {
    // Thread: Main
    // Builder explanation: runBlocking is used for bridging coroutine 
    //      code to normal code.
    //      It blocks the thread calling it until it's execution is complete.
    //      We call our taskT() on main thread and launch coroutines inside it.
    //      It returns the coroutine job as result.
    runBlocking {
        val parentCoroutineC = taskT()

        // Join has to be used for example purposes otherwise the program 
        // will complete before 'T' completes.
        // 
        // This is because 'T' is a fire and forget task: we start it 
        // with the launch coroutine builder, which returns instantly with 
        // the representative job 'C', and the execution of this 
        // block would then be complete.
        // So to wait for 'C' to complete we wait until it joins.
        
        // Also note that since we start 'C' in GlobalScope 
        // it's not a child of the scope provided by the runBlocking. 
        // Hence the parent-child relationship doesn't apply.
        parentCoroutineC.join()
    }
    println("Exiting main()")
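    // The executor thread behind CustomDispatchers.DB is non-daemon and
    // would otherwise keep the JVM alive.
    exitProcess(0)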
}

Running the above code will give you output similar to the following (thread names and numbers may differ):

Starting C on DefaultDispatcher-worker-1
Inside t1() on DefaultDispatcher-worker-1
Starting task T2,T3 on DefaultDispatcher-worker-1
Starting task T4,T5 on DefaultDispatcher-worker-1
Inside Task T2,T3 on DefaultDispatcher-worker-3
Inside t2() on DefaultDispatcher-worker-3
Going to await result of task T2,T3 and T4,T5
Inside t3() on DefaultDispatcher-worker-1
Inside Task T4,T5 on DefaultDispatcher-worker-4
Inside t4() on DefaultDispatcher-worker-4
Inside t5() on pool-1-thread-1
The final result is 1_resultT2_resultT3_resultT4_resultT5

(Note: Tasks t4 and t5 don't start until we call await on jobT4T5. That's because we used lazy coroutine start on it.)
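
The cost of lazy start can be measured with a small sketch like this one (assuming kotlinx.coroutines; the name combinedTime and the 300 ms delays are mine). Two async blocks are awaited in order, and only the second one's start mode changes.

```kotlin
import kotlinx.coroutines.*
import kotlin.system.measureTimeMillis

// Times two 300 ms async blocks; `start` controls when the second one begins.
fun combinedTime(start: CoroutineStart): Long = measureTimeMillis {
    runBlocking {
        val a = async { delay(300); "a" }
        val b = async(start = start) { delay(300); "b" }
        a.await() + b.await()       // a lazy `b` only starts at its await()
    }
}

fun main() {
    // Eager: both run concurrently, so roughly one 300 ms wait.
    println("DEFAULT: ${combinedTime(CoroutineStart.DEFAULT)} ms")
    // Lazy: the second starts after the first finishes, so roughly two waits.
    println("LAZY:    ${combinedTime(CoroutineStart.LAZY)} ms")
}
```

Timings are approximate and machine dependent.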


Now you should have an understanding of all the basic keywords and building blocks.

The line "Coroutines are suspendible execution blocks" should make sense now. They are execution blocks which might suspend the calling coroutine but won't block the thread from which they are called.

The code shown above is available in this GitHub repo:
https://github.com/GauravChaddha1996/DeepDiveIntoCoroutine


Further reading

I've written posts on practical concepts for coroutines where I cover the concepts needed to use coroutines in actual projects and the different use-cases that arise, while showing the code in action to explain the concepts clearly.

  1. Practical concepts for Coroutines (Part 1)
  2. Practical concepts for Coroutines (Part 2)