The mystery of recursion
A very weird concept to grasp at first, but once you understand it,you will gain an incredible skill and perspective on how to solve certain problems.don't worry if you don't get the joke above,i am certain you will get it by the end of this read.
I struggled with recursion for months before i got it.the first time i saw it used was to solve a Sudoku game.now you would think that if you were to write a program to solve Sudoku it would involve some complex logic,a lot of loops and if statements.i tried to code the logic, but it got way too complex,the computer cant think like us,it doesn't have intuition,it just follows a series of instructions if u exclude machine learning of course.so here i was,having written almost 200 lines of python code trying to hard code how a human would solve a Sudoku.it worked for a few positions but there were a lot edge cases, and to actually do the entire thing would require really complex logic or to brute force every possible combination of numbers and choose solutions which worked.
Surely it can't be this hard to solve a Sudoku game.and it turns out its not .recursion could do it in 10 lines of code 😲,you can imagine my shock,never had i ever been more clueless than when i looked at those 10 lines of code,i didn't even touch my computer for the rest of that day.
I understood what each line meant,it wasn't a syntactic issue,it just seemed like it was doing things it wasn't told to do.how does 10 lines of code solve any Sudoku game.answers had to be found.so i did some research by watching a lot of YouTube videos and filling my browser with tabs on recursion and honestly it still didn't make sense.I started considering that maybe i am just an idiot.
Now I know why I did not understand it.It was a problem of thought process,we are so used to follow casualty laws, but we can only follow a chain of events up to a certain extent,after that we loose track or get confused..I was trying to follow each line of code which is what i was used to,I realized that wouldn't work because it got way too complicated.
A new way of thinking was fundamentally needed.So i decided that i will trust the code because it works.I don't have to follow it to its conclusion,i just had to understand that it would reach a certain conclusion based on how it was setup.and that was enough.i think its called abstraction in programming where you don't have to know exactly how a thing works but you only need to know what it is used for and how it is used ,and that is how you have to start thinking about recursion.
don't worry about following the logic, understand the big idea and that's all you need.
WHAT IS THE BIG IDEA?
The big idea is stacks if you understand stacks then recursion then recursion,
then recursion is easy to grasp
what is a stack?
A stack is a LIFO (last in first out) data structure,where the top item is removed first.
 |
delicious stack
|
Look at the pancakes above,though they look delicious,their purpose here is to teach us about stacks.as you can see you can only remove the top pancake first ,and don't say well."gee Alex,I could always remove a middle one if i wanted". That type of behavior is not tolerated here,unless of course a bigger pancake is sitting in the middle,then u can go right ahead and take it.
There are data structures that allow you to remove an item from any where you want like lists. in fact a stack is a list with constraints in which you can only remove the last item from the list.
Lets assume we can only remove the top pancake.so initially the plate is empty, then u add one pancake,then another on top, then another,once u have placed all the pancakes that are available.then the removal process starts,the top one is removed and so on until all pancakes are removed.that's how a recursive function works.actually that's how all functions work,. in the case of a normal function the stack only has one pancake.the function is placed in a stack then after it returns it is removed from the stack .the pancake has its own properties and is unique.this is true for all functions as well ,they have what is called their own environment with its own unique attributes and variables and those variables are only accessible within that function only(Local scope).
What is a recursive function?
a recursive function is a function that calls itself,what does that mean?
it means that inside the function itself the same function is executed (to call a function means to execute or run a function)
This might sound crazy,the function is calling itself.
 |
im you
|
lets take a look at an example with python:
on terminal
type:python
a python interpreter should start,if it doesn't type python3,if it still doesn't work then
python is most likely not installed you can install it or use an online interpreter like
this python interpreter
type this in console or the online editor and click run
def foo():
print('hey') foo()
foo()
what do u think running this function will do?
i would naively say it would print 'hey' forever,but it doesn't.
it seems to run,then give an error.
 |
stack overflow error
|
How can a program run successfully then suddenly stop working.if you look at the error it says maximum recursion depth reached.this means that the call stack got way to big and the computer ran out of stack memory The limit that most computer have on recursive calls is 999,if your function calls itself more times than that the computer stops the function and you get a stack overflow error,not to be confused with the stack overflow website although it was named after this particular error.
if you want to know more about stack overflow error,visit the link below
stack overflow explained
It seems that recursion isn't at all that useful since it seems if we run a recursive call,it will always give us a stack overflow error.but what if we could some how work around that. how useful would it be,that function ran 998 times before crashing and we only wrote four lines of code,you can imagine the possibilities in problems that require repetitive tasks.
whats the workaround stack overflow error?
base cases
The solution is adding base cases to the function,this are conditions that when fulfilled the function stops calling itself and starts returning
lets see an example using our previous code:
run this in interpreter
def foo(n):
if n>10:
return n
print(foo(n+1))
return n
foo(1)
the base case here is n>10,when this condition evaluates to
true,the function doesn't will return a value and
will not reach the point where it has to call itself again.
the base case is a condition in which when fulfilled
the recursive calls stop.
the stack would look like this
the functions pile up since each function is calling itself again and again.
when we reach the base case where n>10,the condition evaluates to true .
that causes the function that sees its input as 11 to return 11 instead of calling itself again.
What happens is that the computer executes what is on top of the stack.
so when the function on top of the pile returns it has been executed and thus is removed from the stack .now the function below it is now on top of the stack, so its executed then removed from the stack.this goes on until the stack is empty
.
if you ran the code you would see exactly this behavior.the first value to be printed would be what was returned by the function on top of the stack
Lets look at a famous example on recursion which is calculating the factorial of a number. a factorial is the number multiplied by all of the numbers below it.
if we are to solve his recursively or any problem,always ask yourself what are the base cases?
for a factorial we know the factorial of factorial of 1 is 1,we can use this as our base case. so when when n is equal to 1 we return 1.
def factorial(n):
#we added this case to deal with factorial(0)
if n==0:
return 0
if n==1:
return 1
this code finds the factorial of any number.
lets look at the call stack for this
Notice that to calculate the factorial of 5 ,the program needs to know the factorial of 4,to calculate the factorial of 4, it needs to know the factorial of 3 and so on.
when it reaches factorial(1) it returns 1 and removed from the top of the stack.
now the factorial of 2 can find a solution because it needed to know the answer of factorial(1) and now it knows its 1 so it will multiply 2*1.and returns 2 and removed from top of the stack
factorial(3) needed to know the answer of factorial(2) which it now knows its 2 so it does 3*2 and returns 6 and it is removed from the top of the stack.
factorial(4) needed to know the answer to factorial(3)and now it knows its 6 so it multiplies 4*6 and returns 24 and is removed from the top of the stack.
factorial(5) needed to know the answer to factorial(4) and now it knows its 24 so it
multiplies 5*24 and returns 120.now the stack is empty and the function is done.
lets take another example.
here is where trying to follow the code fails catastrophically
finding the nth Fibonacci number
def fibo(n):
if n<2:
return n
return fibo(n-1)+fibo(n-2)
print(fibo(5))
lets see what the computer is doing here.
the image shows the return values of each function in the stack
Its not as straight forward is it?,its still a stack but a slightly complex one.
here you can see what each function call returns (the numbers on top of the boxes) .Each function returns the addition result of two more function calls unless a base case is reached where it just returns n.
the reason we return n when n<2 is because the Fibonacci value is the same as n
where fibo(1)=1 and fibo(0)= 0.
This approach to finding the nth Fibonacci number is not optimized.there is a lot of repetition as you can see.the algorithm can be optimized by several methods including caching certain calculations and using tail recursion.
summary
Wow really informative , recursion is as powerful as is efficient.
ReplyDelete