How to work with different file types in Julia
Overview
While working in Julia, especially in data science, we often need to load data in a way that our computer is able to understand for processing, analysis, and modeling.
In this shot, we’ll learn how to load and save different file types in Julia, specifically CSV and SPSS files.
As most data comes in CSV format, let’s first look at the ways we can load data from a CSV file into Julia.
Using Queryverse
Queryverse is a versatile meta-package: it bundles several packages for loading, manipulating, reshaping, and querying data in Julia.
We use Pkg to install the Queryverse package.
At the Julia REPL, we type the right square bracket ] to enter Pkg mode.
Next, we enter the command below:
julia>]
(@v1.07) pkg>
(@v1.07) pkg> add Queryverse
We load our data into a DataFrame:
using Queryverse, DataFrames
df = DataFrame(load("mydata.csv"))
We can use the pipe operator as well:
using Queryverse, DataFrames
df = load("mydata.csv") |> DataFrame
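The pipe operator |> is part of base Julia: x |> f is another way of writing f(x), so both loading forms above do the same thing. A small sketch using only built-in functions (the values are arbitrary examples):

```julia
# x |> f calls f(x)
total = [1, 2, 3] |> sum

# Pipes can be chained from left to right
largest = [3, 1, 2] |> sort |> last

# An anonymous function needs parentheses on the right-hand side
doubled = 4 |> (x -> 2x)

println((total, largest, doubled))
```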
In a Jupyter notebook, the same loading code applies, except that we first install the package through the Pkg module:
import Pkg
Pkg.add("Queryverse")
using Queryverse, DataFrames
df = DataFrame(load("mydata.csv"))
Using CSVFiles and DataFrames
Another option is to use the CSVFiles and DataFrames packages directly.
The CSVFiles package provides load and save support for CSV files in Julia.
We install CSVFiles from Pkg mode in the REPL, or with Pkg.add in a notebook:
julia> ]
(@v1.07) pkg> add CSVFiles
---download messages---
(@v1.07) pkg>
import Pkg
Pkg.add("CSVFiles")
To use the packages and load a CSV file, we enter the commands below:
using CSVFiles, DataFrames
df = DataFrame(load("mydata.csv"))
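Since mydata.csv is only a placeholder, the following sketch first writes a small CSV file so that the load call can be tried from start to finish. The file name sample.csv and its contents are made up for illustration, and the file is placed in a temporary directory.

```julia
using CSVFiles, DataFrames

# Create a small CSV file in a temporary directory (hypothetical sample data)
path = joinpath(mktempdir(), "sample.csv")
write(path, "name,salary\nPeter,1000\nEmma,2000\n")

# load returns a table-like object; DataFrame materializes it
df = DataFrame(load(path))

println(size(df))    # (rows, columns)
println(names(df))   # column names read from the header row
```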
Loading file types other than CSV using Queryverse
Queryverse supports many file types, not just CSV. These include Excel files, Feather files, statistical file formats such as SPSS, and more.
To load an SPSS file (.sav), we enter the commands below:
using Queryverse
df=DataFrame(load("mydata.sav"))
Alternatively, we use the pipe operator:
using Queryverse
df=load("mydata.sav") |> DataFrame
This loads the SPSS file into a DataFrame.
Saving files in Julia
We can also use Queryverse to save the data we create while working on our code.
Syntax
using Queryverse
df = DataFrame(name=["Peter", "Emma"], salary=[1000,2000], department=["IT", "Data"])
df |> save("mydata.csv")
This saves df to the current working directory as the CSV file mydata.csv.
We can also use CSVFiles to save in CSV format.
using CSVFiles, DataFrames
df = DataFrame(name=["Peter", "Emma"], salary=[1000,2000], department=["IT", "Data"])
df |> save("output.csv")
The load and save functions accept a number of optional arguments when loading or saving a CSV file. For example, when saving a file, delim sets the delimiter character and header specifies whether a header row is written.
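As a minimal sketch of these arguments, the snippet below saves a small DataFrame with a semicolon delimiter and no header row, then reads the raw text back to inspect it. The file name people.csv and the temporary directory are placeholders chosen for illustration. The keyword names delim and header follow the CSVFiles save documentation to the best of our knowledge, so it is worth checking them against the installed version.

```julia
using CSVFiles, DataFrames

df = DataFrame(name=["Peter", "Emma"], salary=[1000, 2000])

# Write into a temporary directory so that no existing file is overwritten
path = joinpath(mktempdir(), "people.csv")

# delim sets the column separator; header=false skips the header row
save(path, df; delim=';', header=false)

# Read the raw text back to see exactly what was written
text = read(path, String)
println(text)
```

Reading the file back as plain text, rather than through load, makes the effect of each argument directly visible.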