krangl in 3 Minutes
Welcome to krangl. Relational data and how to handle it properly is a huge topic, but the core concepts are relatively simple. So let's get started!
Columns and Rows
DataFrames are just tables with type constraints within each column. To glance at them horizontally and vertically, we can use
irisData.print(maxRows=10)
irisData.schema()
irisData is bundled with krangl and gives the measurements, in centimeters, of sepal length and width and petal length and width for 50 flowers from each of 3 species of iris. The species are Iris setosa, versicolor, and virginica.
Columns and individual cells can be accessed with
val col = irisData["Species"]
val cell = irisData["Species"][1]
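As a rough sketch of how column access combines with basic shape information, the snippet below inspects the bundled irisData. It assumes a recent krangl release on the classpath and relies on the `names`, `nrow`, and `ncol` properties and the `asStrings()` column accessor; the variable names are purely illustrative.

```kotlin
import krangl.*

// Basic shape information about the bundled iris data set
val columnNames: List<String> = irisData.names
val rowCount: Int = irisData.nrow
val columnCount: Int = irisData.ncol

// Read a whole column as nullable strings and count the distinct species
val speciesCount: Int = irisData["Species"].asStrings().distinct().size

fun main() {
    println("columns: $columnNames")
    println("shape: $rowCount x $columnCount")
    println("distinct species: $speciesCount")
}
```

Typed accessors such as `asStrings()` return plain Kotlin arrays, so the usual collection functions apply from there.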
Get your data into krangl
To save a data frame simply use
irisData.writeCSV(File("my_iris.txt"))
To load a data frame, simply use
val irisCopy = DataFrame.readCSV("my_iris.txt")
krangl can read from TSV, CSV, JSON, and JDBC sources, e.g.
val tornados = DataFrame.readCSV(pathAsStringFileOrUrl)
tornados.writeCSV(File("tornados.txt.gz"))
krangl will guess column types unless the user provides a column type model.
You can also simply define new data-frames in place
val users : DataFrame = dataFrameOf(
"firstName", "lastName", "age", "hasSudo")(
"max", "smith" , 53, false,
"eva", "miller", 23, true,
null , "meyer" , 23, null
)
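To illustrate what such an in-place definition yields, the following sketch rebuilds the same table and checks its shape and null handling. It assumes the `nrow` and `ncol` properties and the `asStrings()` accessor of a recent krangl release; `firstNames` is just an illustrative name.

```kotlin
import krangl.*

// Same in-place definition as above: a header row followed by row-wise values
val users: DataFrame = dataFrameOf(
    "firstName", "lastName", "age", "hasSudo")(
    "max", "smith", 53, false,
    "eva", "miller", 23, true,
    null, "meyer", 23, null
)

// Missing values stay null when a column is read back as a typed array
val firstNames: Array<String?> = users["firstName"].asStrings()

fun main() {
    users.schema()
    println("rows=${users.nrow}, cols=${users.ncol}")
    println("third first name is null: ${firstNames[2] == null}")
}
```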
Note: krangl can also convert any iterable into a data frame via reflection. See the section about Reshaping Data for details.
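A minimal sketch of such a reflection-based conversion is shown below. It assumes the `asDataFrame()` extension on iterables from a recent krangl release (with kotlin-reflect available); the `Person` class and its values are hypothetical and only serve the illustration.

```kotlin
import krangl.*

// Hypothetical record type used only for this illustration
data class Person(val name: String, val age: Int)

// Each object becomes a row; each property becomes a column (discovered via reflection)
val people: DataFrame = listOf(
    Person("max", 53),
    Person("eva", 23)
).asDataFrame()

fun main() {
    people.schema()
    people.print()
}
```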
Other input formats
krangl can also read JSON array data. For a complete overview, see JsonIO.
val df = DataFrame.fromJson("my.json")
val df2 = DataFrame.fromJson("http://foo.bar/my.json")