Any computer system today generates a large volume of logs every day. As a system grows, it is not feasible to store this debugging data in a database: the records are immutable and are only used for analytics and fault resolution. So organisations tend to store them in files that reside on local disk storage.
We are going to extract logs from a 16 GB .txt or .log file containing millions of lines, using Golang.
Let’s open the file first. We will be using the standard Go os.File for all file I/O.
f, err := os.Open(fileName)
if err != nil {
	fmt.Println("cannot read the file", err)
	return
}
defer f.Close() // do not forget to close the file; defer it only after the error check
Once the file is opened, we have two options to proceed with: read the file line by line, or load the whole file into memory at once.
Since the file is as large as 16 GB, we can’t load it entirely into memory. And the first option, reading line by line, is also not feasible for us, as we want to process the file within seconds.
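For context, the first option would look roughly like the sketch below. This is a minimal illustration rather than code from the original walkthrough; it continues from the f opened above and simply counts lines.
lineCount := 0
scanner := bufio.NewScanner(f) // reads the file one line at a time
for scanner.Scan() {
	lineCount++ // each scanner.Text() would be parsed here in a real run
}
if err := scanner.Err(); err != nil {
	fmt.Println("error while scanning the file", err)
}
fmt.Println("lines:", lineCount)
Scanning millions of lines sequentially like this is far too slow for our target, and bufio.Scanner also caps a single line at 64 KB by default.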
But guess what, there is a third option. Voila…! Instead of loading the entire file into memory, we will read the file in chunks, using **bufio.NewReader()**, available in Go.
r := bufio.NewReader(f)
for {
	buf := make([]byte, 4*1024) // the chunk size
	n, err := r.Read(buf)       // load a chunk into the buffer
	buf = buf[:n]               // keep only the bytes actually read
	if n == 0 {
		if err == io.EOF {
			break // reached the end of the file
		}
		if err != nil {
			fmt.Println(err)
			break
		}
	}
	// buf now holds one chunk of the file, ready to be processed
}
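The loop above only loads the chunks; this section does not show the actual processing. As one possible illustration, not the article’s own implementation, the self-contained sketch below fans each chunk out to its own goroutine and waits for them with sync.WaitGroup; the file name messages.log and the line-counting work inside the goroutine are placeholders.
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"sync"
	"sync/atomic"
)

func main() {
	f, err := os.Open("messages.log") // placeholder file name
	if err != nil {
		fmt.Println("cannot read the file", err)
		return
	}
	defer f.Close()

	r := bufio.NewReader(f)
	var wg sync.WaitGroup
	var totalLines atomic.Int64 // shared counter, safe to update from many goroutines

	for {
		buf := make([]byte, 4*1024) // a fresh buffer per chunk, so goroutines never share memory
		n, err := r.Read(buf)
		if n > 0 {
			wg.Add(1)
			go func(chunk []byte) {
				defer wg.Done()
				// placeholder work: count the log lines in this chunk
				var lines int64
				for _, c := range chunk {
					if c == '\n' {
						lines++
					}
				}
				totalLines.Add(lines)
			}(buf[:n])
		}
		if err == io.EOF {
			break
		}
		if err != nil {
			fmt.Println(err)
			break
		}
	}

	wg.Wait() // block until every chunk has been processed
	fmt.Println("log lines seen:", totalLines.Load())
}
In practice you would bound the number of goroutines (for example with a worker pool) and use a much larger chunk size, otherwise a 16 GB file spawns millions of tiny goroutines; the sketch leaves those details out for brevity.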
#go #processing #concurrency #golang