A look at Java through the lens of a simple Vector class

It’s really easy for me to take linear algebra for granted when doing data science. Whether I’m overlooking the specifics of eigendecomposition in Principal Component Analysis, completely forgetting that every weighted sum is actually a dot product between a vector of inputs and a vector of coefficients, or sticking my fingers in my ears to avoid hearing the “tensor” in TensorFlow, I feel like I’ve been shortchanging my degree in math by not getting into the weeds and looking at the implementations.

Although there’s nothing in the definition of a vector space that requires vectors to be represented as lists of numbers, we’ll be focusing on vectors that do take this form. We’re also going to focus only on vector operations involving vectors and scalars––no matrices yet. They’re a can of worms that will not fit in this post.

Before we get going, just a few notes. First, to keep the code as accessible as possible (I’m assuming many people reading this are coming from Python), I’ve called static methods using the notation Vector.method(). This is not strictly necessary, but if you are new to Java, it might help you distinguish between static methods (called from the .class file) and instance methods (called from an object that exists in memory). Second, when talking about vectors as mathematical objects, I will represent them as a bold lowercase letter, e.g. b. Third, when talking about Vector objects in our program, I’ll put the name of the vector in monospace font (so it looks like code).

Finally, I’ll be removing documentation from code examples in this post, in order to keep things readable. If you would like to see the documentation, or you’re interested in skipping straight to the code and avoiding my dazzling commentary on how Java differs from Python, feel free to head over to my github to access the code directly.

Getting started: What data do we need to store?

As discussed above, the vectors we’ll be working with are simply arrays of real numbers, so a skeleton of our Vector class can be seen below:

public class Vector {
    private double[] v;

    // methods not shown
}

Although they have some similarities in terms of syntax for accessing indexed elements, arrays in Java and lists in Python behave in very different ways. Arrays are fixed-length––to append to delete elements, you’d need to use something like an ArrayList . This means our Vector objects are locked into whatever length they’re initialized with, unless we replace the array with a new one of a different length.

Another way this code differs from Python is that keyword private before the variable type. There are ways to hide data in Python, like using the @property annotation on a method with the name of the variable we want to hide), but hiding your instance variables is a core feature that makes Java, Java. To avoid code unintentionally changing the array of numbers scored in the bowels of our Vector object, we make it private , and this means v can only be accessed in ways that we deem to be acceptable.

Now that we have a skeleton to store data, we’ll need to have some way to create a Vector object. The following constructor will do the trick:

public Vector(double ... v) {
    this.v = new double[v.length];
       for (int i = 0; i < v.length; i++) {
          this.v[i] = v[i];      
    }
}

#linear-algebra #mathematics #java #data-science #programming

Programming Linear Algebra in Java: Vector Operations
1.50 GEEK