Connect and retrieve data from databases
Kotlin Notebook provides support for the most common SQL databases:
With Kotlin DataFrame library, Kotlin Notebook can establish connections to databases, execute SQL queries, and import the results for further operations.
Before you start
Kotlin Notebook relies on the Kotlin Notebook plugin, which is bundled and enabled in IntelliJ IDEA by default.
If the Kotlin Notebook features are not available, ensure the plugin is enabled. For more information, see Set up an environment.
To follow this tutorial:
Create a new Kotlin Notebook.
Add the Java Database Connectivity (JDBC) driver dependency for your database in the first cell of your notebook.
For example, to connect to a MariaDB database, add:
USE { dependencies("org.mariadb.jdbc:mariadb-java-client:$version") }Import Kotlin DataFrame:
%use dataframe
Connect to a database
To connect to a database, create a connection configuration using the DbConnectionConfig() function:
Import the following functionality:
import org.jetbrains.kotlinx.dataframe.io.DbConnectionConfig import org.jetbrains.kotlinx.dataframe.schema.DataFrameSchemaDefine connection parameters (URL, username, password) using the
DbConnectionConfig()function:val URL = "YOUR_URL" val USER_NAME = "YOUR_USERNAME" val PASSWORD = "YOUR_PASSWORD" val dbConfig = DbConnectionConfig(URL, USER_NAME, PASSWORD)
Inspect database schema
Before loading the data, inspect the database schemas to understand what tables you have and what columns they contain. You can use the schemas to decide which table to load into a DataFrame.
To retrieve schemas for all user-created tables in your database, use the DataFrameSchema.readAllSqlTables() function:
Load data
After you inspect the database schemas and select the data, load the data into a DataFrame.
Kotlin DataFrame provides two ways to load data from a database:
Load data directly from a table.
Load the result of a custom SQL query.
Both approaches return a DataFrame that you can inspect, transform, and analyze in Kotlin Notebook.
Load data from a table
To load data from a table, use the DataFrame.readSqlTable() function.
The following example loads the first 100 rows from the movies table:
Load data with an SQL query
To execute a specific SQL query on your database, use the DataFrame.readSqlQuery() function. This approach is useful when you need to load specific columns, join tables, filter rows, or aggregate data in a database.
Let's retrieve a specific dataset related to movies directed by Quentin Tarantino. This query selects movie details and combines genres for each movie:
Process data
After loading your database into a DataFrame, you can use DataFrame operations to process retrieved data.
For example, let's manipulate data from the previous section. The following code:
Replaces missing values in the
yearcolumn using the.fillNA()function.Converts the column to
Intusing the.convert()function.Keeps only films released after 2000 using the
.filter()function.
Analyze data
Use Kotlin Notebook and DataFrame library to group, sort, and aggregate data so you can uncover and understand patterns in your data.
For example, let's read actor data from the actors table and find the 20 most common first names:
What's next
Explore data visualization using the Kandy library
Find additional information about data visualization in Data visualization in Kotlin Notebook with Kandy
For an extensive overview of tools and resources available for data science and analysis in Kotlin, see Kotlin and Java libraries for data analysis