How to compress a file in Golang

Like most programming languages, Go supports file compression through a number of built-in packages. The packages used in this shot are:

  1. bufio: This package is used for buffering input and output.

  2. compress/gzip: This package implements reading and writing of gzip-format compressed files.

  3. io/ioutil: The io/ioutil package provides I/O utility functions. As of Go 1.16 it is deprecated, and the same functionality is available in the io and os packages (for example, io.ReadAll).

  4. os: This is a platform-independent interface to operating system functionality. It lets us perform operations related to the operating system, such as manipulating files and directories.

  5. strings: The strings package provides functions to manipulate UTF-8 encoded strings.

  6. fmt: The fmt package implements formatted I/O with functions analogous to C’s printf and scanf.

In this shot, we walk through an example that shows how file compression works in Go. The steps below achieve compression.

1. Open a file

The first thing you want to do is open the file with which you want to work. For this, we will use the os.Open function.

os.Open opens a file for reading. It takes the path of the file as input. If successful, the methods of the returned file can be used for reading.

file_name := "demo.txt"
file, err := os.Open("filePath/" + file_name)
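The snippet above leaves the returned error unchecked. A minimal sketch of the same step with the error handled and the file closed afterwards might look like this. The helper name openFile and the directory name filePath are placeholders for illustration, not part of any library.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// openFile is a hypothetical helper: it joins dir and name into a path
// and opens that file for reading.
func openFile(dir, name string) (*os.File, error) {
	return os.Open(filepath.Join(dir, name))
}

func main() {
	file, err := openFile("filePath", "demo.txt")
	if err != nil {
		// os.Open reports failures as *os.PathError values.
		fmt.Println("could not open file:", err)
		return
	}
	defer file.Close() // release the file descriptor when main returns

	fmt.Println("opened", file.Name())
}
```

Using filepath.Join instead of string concatenation keeps the path separator correct on every platform.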

2. Read the contents

Now we read all the bytes from the file we just opened. To do this, we need the bufio.NewReader() and ioutil.ReadAll() functions.

bufio.NewReader()

func NewReader(rd io.Reader) *Reader 

The bufio.NewReader function takes an io.Reader and returns a new Reader whose buffer has the default size.

ioutil.ReadAll()

func ReadAll(r io.Reader) ([]byte, error)

The ioutil.ReadAll function reads from an io.Reader until an error or EOF and returns the data it read as a slice of bytes. A successful call returns a nil error; otherwise, it returns the error it encountered.

reader := bufio.NewReader(file)
data, err := ioutil.ReadAll(reader)
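As a hedged sketch, the open and read steps can be wrapped in one function. This version assumes Go 1.16 or later, where io.ReadAll replaces the deprecated ioutil.ReadAll. The name readFile is illustrative, and the demo writes its own temporary file so that it does not depend on demo.txt existing.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// readFile is a hypothetical helper that opens path and returns all of
// its bytes, reading through a buffered reader as in the steps above.
func readFile(path string) ([]byte, error) {
	file, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer file.Close()

	// io.ReadAll reads until EOF; EOF itself is not reported as an error.
	return io.ReadAll(bufio.NewReader(file))
}

func main() {
	// Write a small temporary file so the example is self-contained.
	path := filepath.Join(os.TempDir(), "readfile-demo.txt")
	if err := os.WriteFile(path, []byte("hello, gzip"), 0o644); err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	defer os.Remove(path)

	data, err := readFile(path)
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	fmt.Printf("read %d bytes\n", len(data))
}
```

When the whole file is read in one go like this, the bufio wrapper is optional. In Go 1.16 and later, os.ReadFile performs the open, read, and close in a single call.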

3. Create a file with a .gz extension

We replace the .txt file extension with the .gz extension using the strings.Replace function. We then create a new .gz file under the new name using the os.Create function.

strings.Replace()

func Replace(s, old, new string, n int) string

Replace returns a copy of the string s with the first n non-overlapping instances of old replaced by new. Here, s is the original string, old is the substring you want to replace, new is what replaces it, and n is the maximum number of replacements. If n is negative, there is no limit on the number of replacements.

os.Create()

func Create(name string) (*File, error)

Create makes or truncates the specified file. If the file already exists, it will be truncated. If the file does not exist, it’s created in mode 0666 (before umask). If successful, you can use the methods in the returned file for I/O. The associated file descriptor has O_RDWR mode. If there is an error, it will be of type *PathError.

file_name = strings.Replace(file_name, ".txt", ".gz", -1)
file, err = os.Create("filePath/" + file_name)
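One caveat with strings.Replace is that it matches ".txt" anywhere in the name, not only at the end. The sketch below contrasts it with strings.TrimSuffix, which only touches the end of the string. The helper gzName and the sample file names are made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// gzName is a hypothetical helper that swaps a trailing ".txt" for ".gz".
// Names without a ".txt" suffix simply get ".gz" appended.
func gzName(name string) string {
	return strings.TrimSuffix(name, ".txt") + ".gz"
}

func main() {
	name := "notes.txt.backup.txt"

	// Replace with n = -1 rewrites every occurrence, not just the extension.
	fmt.Println(strings.Replace(name, ".txt", ".gz", -1)) // notes.gz.backup.gz

	// Replace with n = 1 rewrites only the first occurrence from the left.
	fmt.Println(strings.Replace(name, ".txt", ".gz", 1)) // notes.gz.backup.txt

	// TrimSuffix touches only the end of the name.
	fmt.Println(gzName(name))       // notes.txt.backup.gz
	fmt.Println(gzName("demo.txt")) // demo.gz
}
```

For a simple name such as demo.txt, both approaches produce demo.gz, so the choice only matters for names that contain the extension more than once.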

4. Copy all read bytes into the new file and close the file

By using gzip.NewWriter() from the compress/gzip package, we create a new gzip writer. We then call Write to write the compressed bytes to the file. When we are done, we must close the writer so that any buffered data is flushed and the gzip footer is written.

gzip.NewWriter()

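func NewWriter(w io.Writer) *Writer

NewWriter returns a new Writer. Writes to the returned Writer are compressed and written to w. The caller must call Close on the Writer when done.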

Write()

func (z *Writer) Write(p []byte) (int, error)

Write writes a compressed form of p to the underlying io.Writer. The compressed bytes are not necessarily flushed until the Writer is closed.

w := gzip.NewWriter(file)
w.Write(data)
w.Close()
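To see why closing the writer matters, here is a small in-memory sketch that compresses a byte slice and then decompresses it again to confirm that the data survives the round trip. It assumes Go 1.16 or later for io.ReadAll. The function names compress and decompress are illustrative, and a bytes.Buffer stands in for the file.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// compress returns data in gzip format.
func compress(data []byte) ([]byte, error) {
	var buf bytes.Buffer
	w := gzip.NewWriter(&buf)
	if _, err := w.Write(data); err != nil {
		return nil, err
	}
	// Close flushes pending data and writes the gzip footer.
	// It does not close the underlying writer.
	if err := w.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decompress reverses compress.
func decompress(data []byte) ([]byte, error) {
	r, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return nil, err
	}
	defer r.Close()
	return io.ReadAll(r)
}

func main() {
	original := []byte("hello hello hello hello")

	packed, err := compress(original)
	if err != nil {
		fmt.Println("compress failed:", err)
		return
	}
	unpacked, err := decompress(packed)
	if err != nil {
		fmt.Println("decompress failed:", err)
		return
	}
	fmt.Println(bytes.Equal(original, unpacked)) // true
}
```

If the Close call in compress is skipped, the output can be missing its final block and footer, and reading it back typically fails with an unexpected end of input.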

Code

The full code example is shown below.

package main

import (
	"bufio"
	"compress/gzip"
	"fmt"
	"io/ioutil"
	"log"
	"os"
	"strings"
)
// checks for error
func ErrorChecker(err error) {
	if err != nil {
		log.Fatal(err)
	}
}

func main() {
	file_name := "demo.txt"
	file, er := os.Open("filePath/" + file_name)

	// here we check if there was an error
	ErrorChecker(er)
	defer file.Close() // closes the input file when main returns
	read := bufio.NewReader(file)

	data, err := ioutil.ReadAll(read)
	ErrorChecker(err)

	file_name = strings.Replace(file_name, ".txt", ".gz", -1)

	file, err = os.Create("filePath/" + file_name)
	ErrorChecker(err)

	w := gzip.NewWriter(file)
	_, err = w.Write(data)
	ErrorChecker(err)

	// Close flushes any buffered data and writes the gzip footer
	ErrorChecker(w.Close())
	ErrorChecker(file.Close())

	// gives a notification when file compression is done
	fmt.Println("File compressed successfully")
}
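The full example reads the entire file into memory before compressing it. For large files, a streaming variant may be preferable. The sketch below is one possible alternative, not the approach used in this shot: it copies from the source file straight into the gzip writer with io.Copy. The helper name compressFile is hypothetical, and the paths are the same placeholders used above.

```go
package main

import (
	"compress/gzip"
	"fmt"
	"io"
	"os"
)

// compressFile is a hypothetical helper that gzips srcPath into dstPath
// by streaming, so the whole input never has to fit in memory.
func compressFile(srcPath, dstPath string) error {
	src, err := os.Open(srcPath)
	if err != nil {
		return err
	}
	defer src.Close()

	dst, err := os.Create(dstPath)
	if err != nil {
		return err
	}
	defer dst.Close() // safety net for early returns

	w := gzip.NewWriter(dst)
	if _, err := io.Copy(w, src); err != nil {
		return err
	}
	// Close the gzip writer first so the footer reaches dst,
	// then close dst and report any error from that as well.
	if err := w.Close(); err != nil {
		return err
	}
	return dst.Close()
}

func main() {
	if err := compressFile("filePath/demo.txt", "filePath/demo.gz"); err != nil {
		fmt.Println("compression failed:", err)
		return
	}
	fmt.Println("File compressed successfully")
}
```

Returning the error instead of calling log.Fatal lets the caller decide how to react, and it allows the deferred Close calls to run.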

Explanation

  • Line 20: We store the name of the file we want to compress in file_name.

  • Line 21: Here, we build the file path and call the os.Open function, which opens the file for reading. This function takes the file path as input.

  • Line 26: We wrap the opened file in a buffered reader.

  • Line 28: Here, we use the ReadAll function to read all the bytes from the reader into data.

  • Line 31: With the help of the Replace function, we replace the .txt extension in the file name with the .gz extension.

  • Line 33: We use the os.Create function to create the new .gz file.

  • Line 36: Here, we use NewWriter to create a gzip writer that wraps the new file.

  • Line 37: With the aid of the Write method, we write all the bytes in the data variable from the original file. The writer compresses them on the way to the .gz file.

  • Line 41: Close the gzip writer. This flushes any remaining compressed data to the file.

Result

Here, we have the uncompressed .txt file and the compressed .gz file. The size of each file appears at the top right of its image. demo.txt, the original file, occupies more disk space than demo.gz. This shows that our code ran successfully and the file was compressed.

Uncompressed .txt file
Compressed .gz file
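How much space is saved depends on the content. As a rough sketch, the program below compares original and compressed sizes for two made-up inputs. Repetitive text shrinks a lot, while a very small input can end up larger, because the gzip format adds a fixed header and footer. The helper name gzipSize and the sample data are assumptions for illustration.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
)

// gzipSize is a hypothetical helper that returns the size in bytes of
// data after gzip compression.
func gzipSize(data []byte) (int, error) {
	var buf bytes.Buffer
	w := gzip.NewWriter(&buf)
	if _, err := w.Write(data); err != nil {
		return 0, err
	}
	if err := w.Close(); err != nil {
		return 0, err
	}
	return buf.Len(), nil
}

func main() {
	repetitive := bytes.Repeat([]byte("golang "), 1000) // 7000 bytes
	tiny := []byte("a")                                 // 1 byte

	for _, sample := range [][]byte{repetitive, tiny} {
		n, err := gzipSize(sample)
		if err != nil {
			fmt.Println("compression failed:", err)
			return
		}
		fmt.Printf("original: %d bytes, gzip: %d bytes\n", len(sample), n)
	}
}
```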

Conclusion

The packages used in this shot show that Go's standard library alone is enough to compress a file, with no third-party dependencies required.
