
There are many quality articles on how to become a software developer. They teach you to program, develop, and use libraries. But far less has been written to teach Algorithms and Data Structures. No matter how good you are at development, without knowledge of Algorithms and Data Structures, you will struggle to get through most technical interviews.

Learning popular algorithms like Matrix Chain Multiplication, Knapsack or Travelling Salesman Algorithms is not sufficient. Interviewers ask problems like the ones you find on competitive programming sites. To solve such problems, you need to have a good and firm understanding of the concepts.

What is Dynamic Programming?

According to Wikipedia, dynamic programming is simplifying a complicated problem by breaking it down into simpler sub-problems in a recursive manner. This article will teach you to:

-> Identify subproblems

-> Learn how to solve subproblems

-> Identify that subproblems are repetitive

-> Identify that subproblems have optimal substructure property

-> Learn to cache/store results of sub problems

-> Develop a recursive relation to solve the problem

-> Use top-down and bottom-up approach to solve the problem

Which language will I use?

I know that most people are proficient or have experience coding in JavaScript. Also, once you learn in JavaScript, it is very easy to transform it into Java code. The same can be said of Python or C++. The trick is to understand the problems in the language you like the most. Hence I have chosen to use JavaScript.

This post is about algorithms and more specifically about dynamic programming. It is generally perceived as a tough topic. If you make it to the end of the post, I am sure you can tackle many dynamic programming problems on your own.

Problem Statement

Problem: Given an integer n, find the minimum number of steps to reach integer 1.

At each step, you can:

Subtract 1,

Divide by 2, if it is divisible by 2

Divide by 3, if it is divisible by 3

All Dynamic programming problems have a start state. You have to reach the goal by transitioning through a number of intermediate states. In a typical textbook, you will often hear the term subproblem. It is the same as a state. The terms can be used interchangeably. In this article, I will use the term state instead of the term subproblem.

What is a subproblem or state? A subproblem/state is a smaller instance of the original problem. The methods used to solve the original problem and the subproblem are the same.

Some problems will give you the rules that specify the state transitions. This is one such problem. This problem says you can move to n-1, n/2 or n/3 starting from n. On the flip side, there are problems that will not specify the state transitions. You will have to figure them out by yourself. I will talk about these types of problems in another post.

Here,

Start state -> n

Goal -> 1

Intermediate states -> any integer number between 1 and n

Given a state (either start or intermediate), you can always move to a fixed number of states.

from n you can move to :

n -> n-1

if n % 2 == 0:

n -> n/2

if n % 3 == 0:

n -> n/3

example:

from 3 you can move to,

3 -> 3-1 = 2

3 -> 3/3 = 1

from 4 you can move to,

4 -> 4-1 = 3

4 -> 4/2 = 2
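The transition rules above can be sketched as a small helper. The name nextStates is an assumption made for illustration; it is not part of the solution developed later in this post.

```javascript
// Hypothetical helper: lists every state reachable from n in a single step.
function nextStates(n) {
  const states = [];
  if (n > 1) {
    states.push(n - 1);                  // subtract 1
    if (n % 2 === 0) states.push(n / 2); // divide by 2
    if (n % 3 === 0) states.push(n / 3); // divide by 3
  }
  return states;
}

console.log(nextStates(3)); // [ 2, 1 ]
console.log(nextStates(4)); // [ 3, 2 ]
```

The goal state 1 has no outgoing transitions, so the helper returns an empty array for it.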

In a dynamic programming optimization problem, you have to determine which sequence of states from start to goal gives you an optimal solution.

For n = 4:

approach one:

4 -> 3 -> 2 -> 1

approach two:

4 -> 2 -> 1

approach three:

4 -> 3 -> 1

Here, of the three approaches, approaches two and three are optimal, as they require the fewest moves/transitions. Approach one is the worst, as it requires one more move.

Textbook terminologies explained

Repetitive subproblems : You will end up solving the same problem more than once.

for n = 5

example:

5 -> 4 -> 3 -> 1

5 -> 4 -> 2 -> 1

5 -> 4 -> 3 -> 2 -> 1

observe here that 2 -> 1 occurs two times.

also observe that 5 -> 4 occurs three times.
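To see the repetition in numbers, here is a rough sketch that counts how often plain recursion (with no caching) visits each state. The function name countVisits and the visits object are assumptions made for this illustration.

```javascript
// Sketch: count how many times uncached recursion visits each state.
function countVisits(n, visits = {}) {
  visits[n] = (visits[n] || 0) + 1;
  if (n <= 1) return visits;                   // goal reached, stop here
  countVisits(n - 1, visits);                  // subtract 1
  if (n % 2 === 0) countVisits(n / 2, visits); // divide by 2
  if (n % 3 === 0) countVisits(n / 3, visits); // divide by 3
  return visits;
}

const visits = countVisits(5);
console.log(visits[2]); // 2 (state 2 is solved twice)
console.log(visits[1]); // 5 (state 1 is reached five times)
```

Even for n = 5 the same states are solved more than once, and the repetition grows quickly as n grows.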

Optimal Substructure : Optimal solutions to subproblems give the optimal solution to the entire problem.

example:

2 -> 1 is optimal

3 -> 1 is optimal

when I am at 4,

4 -> 3 -> 2 -> 1 and 4 -> 3 -> 1 are possible

but the optimal solution of 4 is 4 -> 3 -> 1. The optimal solution of four comes from optimal solution of three (3 -> 1).

similarly,

4 -> 3 -> 2 -> 1 and 4 -> 2 -> 1 are possible

but the optimal solution of 4 is 4 -> 2 -> 1. The optimal solution of four comes from optimal solution of two (2 -> 1).

now 5,

The optimal solution of 5 depends on optimal solution to 4.

5 -> 4 -> 2 -> 1 and 5 -> 4 -> 3 -> 1 are optimal.

How should you use Repetitive subproblems and Optimal Substructure to your advantage?

We will solve each subproblem only once, and solve it optimally.

We solve the subproblems 3 -> 1 and 2 -> 1 only once, each optimally.

For 4, we solve only once, using 4 -> 3 -> 1. (4 -> 2 -> 1 is equally good; the choice is yours.)

Finally, for 5, we solve only once, using 5 -> 4 -> 3 -> 1.

In practice you will use an array to store the optimal result of a subproblem. This way when you have to solve the subproblem again, you can get the value from the array rather than solving it again. Essentially you are now solving a subproblem only once.
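The effect of the array can be sketched by counting recursive calls with and without it. The names countCalls and solve are assumptions made for this illustration, and the input 25 is an arbitrary small value.

```javascript
// Sketch: count recursive calls, optionally reusing results stored in an array.
function countCalls(n, cache) {
  let calls = 0;
  function solve(k) {
    calls++;
    if (k <= 1) return 0;
    // Reuse a stored result instead of solving the subproblem again.
    if (cache && cache[k] !== undefined) return cache[k];
    let best = 1 + solve(k - 1);
    if (k % 2 === 0) best = Math.min(best, 1 + solve(k / 2));
    if (k % 3 === 0) best = Math.min(best, 1 + solve(k / 3));
    if (cache) cache[k] = best;
    return best;
  }
  solve(n);
  return calls;
}

const withoutCache = countCalls(25, null); // every subproblem re-solved
const withCache = countCalls(25, []);      // each subproblem solved once
console.log(withoutCache, withCache);
```

The cached run makes noticeably fewer calls, and the gap widens as n grows.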

How to measure Optimality

By using something called cost. There is always a cost associated with moving from one state to another state. Cost may be zero or a finite number. The set of moves/transitions that give the optimal cost is the optimal solution.

In 5 -> 4 -> 3 -> 1

for 5 -> 4 cost is 1

for 4 -> 3 cost is 1

for 3 -> 1 cost is 1

The total cost of 5 -> 4 -> 3 -> 1 is the sum of these costs, which is 3.

In 5 -> 4 -> 3 -> 2 -> 1

for 5 -> 4 cost is 1

for 4 -> 3 cost is 1

for 3 -> 2 cost is 1

for 2 -> 1 cost is 1

The total cost of 5 -> 4 -> 3 -> 2 -> 1 is the sum of these costs, which is 4.

The optimal solution 5 -> 4 -> 3 -> 1 has a cost of three, which is the minimum. Hence we can see that optimal solutions have optimal costs.
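The cost calculation above can be sketched as a helper that adds 1 for every valid move in a path. The name pathCost is an assumption made for illustration.

```javascript
// Hypothetical helper: total cost of a path when every valid move costs 1.
function pathCost(path) {
  let cost = 0;
  for (let i = 1; i < path.length; i++) {
    const from = path[i - 1];
    const to = path[i];
    // A move is valid if it subtracts 1, halves an even number,
    // or divides a multiple of 3 by 3.
    const valid =
      to === from - 1 ||
      (from % 2 === 0 && to === from / 2) ||
      (from % 3 === 0 && to === from / 3);
    if (!valid) throw new Error(`Invalid move ${from} -> ${to}`);
    cost += 1; // unit cost per transition
  }
  return cost;
}

console.log(pathCost([5, 4, 3, 1]));    // 3
console.log(pathCost([5, 4, 3, 2, 1])); // 4
```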

Recursive Relation: All dynamic programming problems have recursive relations. Once you define a recursive relation, the solution is merely translating it into code.

For the above problem, let us define minOne as the function that we will use to solve the problem and the cost of moving from one state to another as 1.

if n = 5,

solution to 5 is cost + solution to 4

recursive formulae/relation is

minOne(5) = 1 + minOne(4)

Similarly,

if n = 6,

recursive formulae/relation is

minOne(6) = min(

1 + minOne(5),

1 + minOne(3),

1 + minOne(2) )
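As a first step, the relation can be translated into code almost word for word. This is only a sketch with no caching yet, so it repeats work; the name minOneNaive is an assumption used to keep it apart from the cached versions.

```javascript
// Direct translation of the recursive relation, without any caching.
function minOneNaive(n) {
  if (n <= 1) return 0;              // base case: already at 1
  let best = 1 + minOneNaive(n - 1); // subtract 1 is always allowed
  if (n % 2 === 0) best = Math.min(best, 1 + minOneNaive(n / 2));
  if (n % 3 === 0) best = Math.min(best, 1 + minOneNaive(n / 3));
  return best;
}

console.log(minOneNaive(5)); // 3
console.log(minOneNaive(6)); // 2
```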

Code

Dynamic programming problems can be solved by a top down approach or a bottom up approach.

Top Down : Solve problems recursively.

for n = 5, you will solve/start from 5, that is from the top of the problem.

It is a relatively easy approach provided you have a firm grasp on recursion. I say that this approach is easy as this method is as simple as transforming your recursive relation into code.

Bottom Up : Solve problems iteratively.

for n = 5, you will solve/start from 1, that is from the bottom of the problem.

This approach uses a for loop. It does not lead to stack overflow as recursion can. It is also usually slightly faster, since it avoids function-call overhead.

Which approach is better?

It is up to your comfort. Both give the same solutions. For very large problems, bottom-up is beneficial as it does not lead to stack overflow. With an input as large as 10000, the top-down approach can fail with a "Maximum call stack size exceeded" error (the exact limit depends on the JavaScript engine), while the bottom-up approach still gives you the solution.

But do remember that you cannot eliminate recursive thinking completely. You will always have to define a recursive relation irrespective of the approach you use.

Bottom-Up approach

```
/*
Problem: Given an integer n, find the minimum number of steps to reach integer 1.
At each step, you can:
Subtract 1,
Divide by 2, if it is divisible by 2
Divide by 3, if it is divisible by 3
*/

// bottom-up
function minOneBottomUp(n) {
  const cache = [];
  // base condition
  cache[1] = 0;
  for (let i = 2; i <= n; i++) {
    // initialize a, b and c to a value larger than any possible answer
    let a = Infinity, b = Infinity, c = Infinity;
    // one step from i -> i-1
    a = 1 + cache[i - 1];
    // one step from i -> i/2 if i is divisible by 2
    if (i % 2 === 0) {
      b = 1 + cache[i / 2];
    }
    // one step from i -> i/3 if i is divisible by 3
    if (i % 3 === 0) {
      c = 1 + cache[i / 3];
    }
    // Store the minimum number of steps to reach 1 from i
    cache[i] = Math.min(a, b, c);
  }
  // return the minimum number of steps to reach 1 from n
  return cache[n];
}

console.log(minOneBottomUp(1000)); // 9
```

Function declaration : The function that solves the problem is named minOneBottomUp. It takes n as the input.

cache array : The array that stores the result of every solved state, so that there is no repeated computation, is named cache. Some people like to call the array dp instead of cache. In general, cache[i] is interpreted as the minimum number of steps to reach 1 starting from i.

Base condition : cache[1] = 0. It says that the minimum number of steps to reach 1 starting from 1 is 0.

For loop : Fills the cache for all states from 2 to n inclusive. State 1 is already covered by the base condition.

Initial values : Variables a, b and c start at a value larger than any possible answer, so that an operation that does not apply can never win the Math.min comparison. Here a represents the minimum number of steps if I did the operation i-1, b represents the minimum number of steps if I did the operation i/2, and c represents the minimum number of steps if I did the operation i/3.

a = 1 + cache[i-1] : This follows from the recursive relation we defined earlier.

if (i % 2 === 0) { b = 1 + cache[i/2]; } : This follows from the recursive relation we defined earlier.

if (i % 3 === 0) { c = 1 + cache[i/3]; } : This follows from the recursive relation we defined earlier.

cache[i] = Math.min(a, b, c) : This is the most important step. It determines which of a, b and c gave the minimum number of steps and stores that value.

return cache[n] : All the computations are complete. The minimum steps for all states from 1 to n have been calculated. I return cache[n] (the minimum number of steps to reach 1 starting from n), which is the answer we wanted.

console.log(minOneBottomUp(1000)) : Testing code. It prints 9.
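The bottom-up table only tells you how many steps are needed. If you also want to see the moves themselves, one possible extension is to remember the best move for each state. This is a sketch under that assumption; the names minOnePath, steps and next are made up for illustration.

```javascript
// Sketch: bottom-up table plus a `next` array that remembers the best move.
function minOnePath(n) {
  const steps = [0, 0]; // steps[1] = 0; index 0 is unused
  const next = [];      // next[i] = state to move to from i
  for (let i = 2; i <= n; i++) {
    steps[i] = 1 + steps[i - 1];
    next[i] = i - 1;
    if (i % 2 === 0 && 1 + steps[i / 2] < steps[i]) {
      steps[i] = 1 + steps[i / 2];
      next[i] = i / 2;
    }
    if (i % 3 === 0 && 1 + steps[i / 3] < steps[i]) {
      steps[i] = 1 + steps[i / 3];
      next[i] = i / 3;
    }
  }
  // Follow the stored moves from n down to 1.
  const path = [n];
  for (let state = n; state > 1; state = next[state]) {
    path.push(next[state]);
  }
  return path;
}

console.log(minOnePath(10)); // [ 10, 9, 3, 1 ]
```

When several moves tie, this sketch keeps the first one it found, so it returns one optimal path rather than all of them.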

Top-Down approach

```
/*
Problem: Given an integer n, find the minimum number of steps to reach integer 1.
At each step, you can:
Subtract 1,
Divide by 2, if it is divisible by 2
Divide by 3, if it is divisible by 3
*/

// top-down
function minOne(n, cache) {
  // if the array value at n is not undefined, return the value at that index
  // This is the heart of dynamic programming
  if (typeof cache[n] !== 'undefined') {
    return cache[n];
  }
  // if n has reached 1 return 0
  // terminating/base condition
  if (n <= 1) {
    return 0;
  }
  // initialize a, b and c to a value larger than any possible answer
  let a = Infinity, b = Infinity, c = Infinity;
  // one step from n -> n-1
  a = 1 + minOne(n - 1, cache);
  // one step from n -> n/2 if n is divisible by 2
  if (n % 2 === 0) {
    b = 1 + minOne(n / 2, cache);
  }
  // one step from n -> n/3 if n is divisible by 3
  if (n % 3 === 0) {
    c = 1 + minOne(n / 3, cache);
  }
  // Store and return the minimum number of steps to reach 1 from n
  return cache[n] = Math.min(a, b, c);
}

const cache = [];
console.log(minOne(1000, cache)); // 9
```

Function declaration : The function that solves the problem is named minOne. It takes n and cache as the inputs.

Cache check : It checks whether the solution for a particular state has already been computed. If it has, it returns the previously computed value. This is the top-down way of not doing repeated computation.

Base condition : if (n <= 1) return 0. It says that if n is 1, the minimum number of steps is 0.

Initial values : Variables a, b and c start at a value larger than any possible answer, so that an operation that does not apply can never win the Math.min comparison. Here a represents the minimum number of steps if I did the operation n-1, b represents the minimum number of steps if I did the operation n/2, and c represents the minimum number of steps if I did the operation n/3.

a = 1 + minOne(n-1, cache) : This follows from the recursive relation we defined earlier.

if (n % 2 === 0) { b = 1 + minOne(n/2, cache); } : This follows from the recursive relation we defined earlier.

if (n % 3 === 0) { c = 1 + minOne(n/3, cache); } : This follows from the recursive relation we defined earlier.

return cache[n] = Math.min(a, b, c) : This determines which of a, b and c gave the minimum number of steps, stores that value in the cache and returns it.

const cache = [] and console.log(minOne(1000, cache)) : Testing code. It prints 9.
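Since both approaches implement the same recursive relation, they should agree on every input. Here is a quick sanity check. To keep the snippet self-contained it uses compact re-implementations, named topDown and bottomUp purely for illustration, and the range 1 to 500 is an arbitrary choice.

```javascript
// Compact top-down version (memoized recursion).
function topDown(n, cache = []) {
  if (n <= 1) return 0;
  if (cache[n] !== undefined) return cache[n];
  let best = 1 + topDown(n - 1, cache);
  if (n % 2 === 0) best = Math.min(best, 1 + topDown(n / 2, cache));
  if (n % 3 === 0) best = Math.min(best, 1 + topDown(n / 3, cache));
  cache[n] = best;
  return best;
}

// Compact bottom-up version (iterative table fill).
function bottomUp(n) {
  const table = [0, 0]; // table[1] = 0; index 0 is unused
  for (let i = 2; i <= n; i++) {
    let best = 1 + table[i - 1];
    if (i % 2 === 0) best = Math.min(best, 1 + table[i / 2]);
    if (i % 3 === 0) best = Math.min(best, 1 + table[i / 3]);
    table[i] = best;
  }
  return table[n];
}

// Compare the two approaches on every n from 1 to 500.
let allAgree = true;
for (let n = 1; n <= 500; n++) {
  if (topDown(n) !== bottomUp(n)) allAgree = false;
}
console.log(allAgree); // true
```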

Time Complexity

In Dynamic programming problems, Time Complexity is the number of unique states/subproblems * time taken per state.

In this problem, for a given n, there are n unique states/subproblems. For convenience, each state is said to be solved in a constant time. Hence the time complexity is O(n * 1).

This can be easily cross-verified with the for loop we used in the bottom-up approach. We use only one for loop to solve the problem. Hence the time complexity is O(n), or linear.
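One rough way to check the linear growth is to count how many states the bottom-up loop evaluates. The helper name countEvaluations and the sample sizes below are assumptions made for this illustration.

```javascript
// Counts how many states the bottom-up loop evaluates for a given n.
function countEvaluations(n) {
  const cache = [0, 0]; // cache[1] = 0; index 0 is unused
  let evaluations = 0;
  for (let i = 2; i <= n; i++) {
    let best = 1 + cache[i - 1];
    if (i % 2 === 0) best = Math.min(best, 1 + cache[i / 2]);
    if (i % 3 === 0) best = Math.min(best, 1 + cache[i / 3]);
    cache[i] = best;
    evaluations++; // one constant-time evaluation per state
  }
  return evaluations;
}

for (const n of [1000, 2000, 4000]) {
  console.log(n, countEvaluations(n)); // evaluations grow in step with n
}
```

Doubling n roughly doubles the number of evaluations, which is what O(n) predicts.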

This is the power of dynamic programming. It allows such complex problems to be solved efficiently.

Space Complexity

We use one array called cache to store the results of n states. Hence the size of the array is n. Therefore the space complexity is O(n).

DP as Space-Time tradeoff

Dynamic programming makes use of space to solve a problem faster. In this problem, we are using O(n) space to solve the problem in O(n) time. Hence we trade space for speed/time. Therefore it’s aptly called the Space-Time tradeoff.

Wrapping up

I hope this post demystifies dynamic programming. I understand that reading through the entire post might’ve been painful and tough, but dynamic programming is a tough topic. Mastering it requires a lot of practice.

I will publish more articles on demystifying different types of dynamic programming problems. I will also publish an article on how to transform a backtracking solution into a dynamic programming solution.




If you accumulate data on which you base your decision-making as an organization, you most probably need to think about your data architecture and consider possible best practices. Gaining a competitive edge, remaining customer-centric to the greatest extent possible, and streamlining processes to get on-the-button outcomes can all be traced back to an organization’s capacity to build a future-ready data architecture.

In what follows, we offer a short overview of the overarching capabilities of data architecture. These include user-centricity, elasticity, robustness, and the capacity to ensure the seamless flow of data at all times. Added to these are automation enablement, plus security and data governance considerations. These points form our checklist for what we perceive to be an anticipatory analytics ecosystem.

#big data #data science #big data analytics #data analysis #data architecture #data transformation #data platform #data strategy #cloud data platform #data acquisition


The opportunities big data offers also come with very real challenges that many organizations are facing today. Often, it’s finding the most cost-effective, scalable way to store and process boundless volumes of data in multiple formats that come from a growing number of sources. Then organizations need the analytical capabilities and flexibility to turn this data into insights that can meet their specific business objectives.

This Refcard dives into how a data lake helps tackle these challenges at both ends — from its enhanced architecture that’s designed for efficient data ingestion, storage, and management to its advanced analytics functionality and performance flexibility. You’ll also explore key benefits and common use cases.

As technology continues to evolve with new data sources, such as IoT sensors and social media churning out large volumes of data, there has never been a better time to discuss the possibilities and challenges of managing such data for varying analytical insights. In this Refcard, we dig deep into how data lakes solve the problem of storing and processing enormous amounts of data. While doing so, we also explore the benefits of data lakes, their use cases, and how they differ from data warehouses (DWHs).

*This is a preview of the Getting Started With Data Lakes Refcard. To read the entire Refcard, please download the PDF from the link above.*

#big data #data analytics #data analysis #business analytics #data warehouse #data storage #data lake #data lake architecture #data lake governance #data lake management


If I ask you what your morning routine is, what will you answer? Let me answer it for you. You will wake up in the morning, freshen up, go for some exercise, come back, bathe, have breakfast, and then get ready for the rest of your day.

If you observe closely these are a set of rules that you follow daily to get ready for your work or classes. If you skip even one step, you will not achieve your task, which is getting ready for the day.

These steps do not contain the details like, at what time you wake up or which toothpaste did you use or did you go for a walk or to the gym, or what did you have in your breakfast. But all they do contain are some basic fundamental steps that you need to execute to perform some task. This is a very basic example of algorithms. This is an algorithm for your everyday morning.

In this article, we will be learning about algorithms, their characteristics, types of algorithms, and most importantly, the complexity of algorithms.

Algorithms are a finite set of rules that must be followed for problem-solving operations. Algorithms are step-by-step guides to how the execution of a process or a program is done on a machine to get the expected output.

- Algorithms do not contain complete programs or details. They are just logical solutions to a problem.
- Algorithms are expressible in simple language or flowchart.

No one follows written instructions for a daily morning routine, and likewise not every set of written instructions counts as an algorithm. To be considered an algorithm, instructions must have some specific characteristics:

**1. Input:** An algorithm, if required, should have very well-defined inputs. An algorithm can have zero or more inputs.

**2. Output:** Every algorithm should have one or more very well-defined outputs. Without an output, the algorithm fails to give the result of the tasks performed.

**3. Unambiguous:** The algorithm should be unambiguous and it should not have any confusion under any circumstances. All the sentences and steps should be clear and must have only one meaning.

**4. Finiteness:** The steps in the algorithm must be finite and there should be no infinite loops or steps in the algorithm. In simple words, an algorithm should always end.

**5. Effectiveness:** An algorithm should be simple, practically possible, and easy to understand for all users. It should be executable upon the available resources and should not contain any kind of futuristic technology or imagination.

**6. Language independent:** An algorithm must be in plain language so that it can be easily implemented in any computer language and yet the output should be the same as expected.

**1. Problem:** To write a solution you need to first identify the problem. The problem can be an example of the real-world for which we need to create a set of instructions to solve it.

**2. Algorithm:** Design a step-by-step procedure for the above problem and this procedure, after satisfying all the characteristics mentioned above, is an algorithm.

**3. Input:** After creating the algorithm, we need to give the required input. There can be zero or more inputs in an algorithm.

**4. Processing unit:** The input is now forwarded to the processing unit and this processing unit will produce the desired result according to the algorithm.

**5. Output:** The desired or expected output of the program according to the algorithm.

Suppose you want to cook chole ( or chickpeas) for lunch. Now you cannot just go to the kitchen and set utensils on gas and start cooking them. You must have soaked them for at least 12 hours before cooking, then chop desired vegetables and follow many steps after that to get the delicious taste, texture, and nutrition.

This is the need for algorithms. To get desired output, you need to follow some specific set of rules. These rules do not contain details like in the above example, which masala you are using or which salt you are using, or how many chickpeas you are soaking. But all these rules contain a basic step-by-step guide for best results.

We need algorithms for the following two reasons :

**1. Performance:** The result should be as expected. You can break the large problems into smaller problems and solve each one of them to get the desired result. This also shows that the problem is feasible.

**2. Scalability:** When you have a big problem or a similar kind of smaller problem, the algorithm should work and give the desired output for both problems. In our example, no matter how many people you have for lunch the same algorithm of cooking chickpeas will work every single time if followed correctly.

Let us try to write an algorithm for our lunch problem :

1. Soak chickpeas overnight so that they are ready by the next afternoon.

2. Chop some vegetables that you like.

3. Set up a utensil on gas and saute the chopped vegetables.

4. Add water and wait for boiling.

5. Add chickpeas and wait until you get the desired texture.

6. Chickpeas are now ready for your lunch.

The real-world example that we just discussed is a very close example of the algorithm. You cannot just start with step 3 and start cooking. You will not get the desired result. To get the desired result, you need to follow the specific order of rules. Also, each instruction should be clear in an algorithm as we can see in the above example.

#algorithms in data structure #data structure algorithms #algorithms


Companies across every industry rely on big data to make strategic decisions about their business, which is why data analyst roles are constantly in demand. Even as we transition to more automated data collection systems, data analysts remain a crucial piece in the data puzzle. Not only do they build the systems that extract and organize data, but they also make sense of it –– identifying patterns, trends, and formulating actionable insights.

If you think that an entry-level data analyst role might be right for you, you might be wondering what to focus on in the first 90 days on the job. What skills should you have going in and what should you focus on developing in order to advance in this career path?

Let’s take a look at the most important things you need to know.

#data #data-analytics #data-science #data-analysis #big-data-analytics #data-privacy #data-structures #good-company


Continuing on the Quick Revision of Important Questions for My Interviews. These Are Good Puzzles or Questions Related to Data Structures.

*My Article Series on Algorithms and Data Structures in a Sort of ‘Programming Language Agnostic Way’. Few of the Algorithms and Data Structures in C, Few in C++, and Others in Core Java. Assorted Collection for Learning, Revising, Revisiting, Quick Refresh, and a Quick Glance for Interviews. You May Even Include them Directly for Professional or Open Source Efforts. Have Included Explanation Only for Few of These! Hope these turn out to be Really Helpful as per the Author’s Intention.*

#java #core java #data structures #dijkstra #core java basics #data structure using java #algorithms and data structures #java code examples #linked list in java #circular linked list